Pathway-specific polygenic risk scores correlate with clinical status and Alzheimer’s-related biomarkers

Background: APOE is the largest genetic risk factor for sporadic Alzheimer’s disease (AD), but there is a substantial polygenic component as well. Polygenic risk scores (PRS) can summarize small effects across the genome but may obscure differential risk associated with different molecular processes and pathways. Variability at the genetic level may contribute to the extensive phenotypic heterogeneity of AD. Here, we examined polygenic risk impacting specific pathways associated with AD and its relationship with clinical status and AD biomarkers of amyloid, tau, and neurodegeneration (A/T/N). Methods: A total of 1,411 participants from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) with genotyping data were included. Sets of variants identified from a pathway analysis of AD GWAS summary statistics were combined into clusters based on their assigned pathway. We constructed pathway-specific PRSs for each participant and tested their associations with diagnostic status (AD vs cognitively normal), abnormal levels of amyloid and ptau (positive vs negative), and hippocampal volume. The APOE region was excluded from all PRSs, and analyses controlled for APOE-ε4 carrier status. Results: Thirteen pathway clusters were identified relating to categories such as immune response, amyloid precursor processing, protein localization, lipid transport and binding, tyrosine kinase, and endocytosis. Eight pathway-specific PRSs were significantly associated with AD dementia diagnosis. Amyloid-positivity was associated with endocytosis and fibril formation, response misfolded protein, and regulation protein tyrosine PRSs. Ptau positivity and hippocampal volume were both related to protein localization and mitophagy PRS, and ptau positivity was additionally associated with an immune signaling PRS. 
A global AD PRS showed stronger associations with diagnosis and all biomarkers compared to pathway PRSs, suggesting a strong synergistic effect of all loci contributing to the global AD PRS. Conclusions: Pathway PRS may contribute to understanding separable disease processes, but do not appear to add significant power for predictive purposes. These findings demonstrate that, although genetic risk for AD is widely distributed, AD-phenotypes may be preferentially associated with risk in specific pathways. Defining genetic risk along multiple dimensions at the individual level may help clarify the etiological heterogeneity in AD.

identifying 75 risk loci (7). Finding effective treatment for AD remains elusive, and there is increasing focus on its heterogeneous presentation, both clinically and at the level of pathobiological mechanisms.
Ultimately, identifying the sources of genetic risk for AD may not only shed light on the pathobiology of the disease but also lead to novel drug targets.
While large-scale genome-wide association studies (GWAS) continue to identify specific risk loci, the polygenic nature of AD suggests that informative genetic signals may fall beneath the genome-wide significance threshold. One approach to capturing these weak associations is to construct polygenic risk scores (PRSs) by taking the sum of all putative risk variants, defined broadly, weighted by their effect size from independent GWAS, assigning each individual a score, and then testing the association of this score with diagnosis and related phenotypes (8). For genetically complex diseases such as AD, PRSs have been shown to strengthen AD diagnostic classification beyond the use of APOE genotypes (9) and have further been shown to be associated with brain structure, Aβ and tau pathology, and cognitive decline (10)(11)(12)(13). AD PRSs have also been shown to be associated with increased risk for mild cognitive impairment among individuals in their 50s (14), and have even been shown to be associated with brain structure in young adults (15,16), demonstrating their utility across multiple age ranges.
A benefit of PRSs is that they provide a global summary measure that aggregates the large and small effect sizes of different variants across the genome. However, aggregating the effects of individual loci may obscure distinct sources of risk. One approach to overcome this limitation has been to calculate PRS with the APOE region removed to demonstrate effects of the APOE gene and APOE-independent variants (17)(18)(19). In this light, AD risk genes identified through GWAS have been associated with a number of pathways, such as immune function, cholesterol transport, mitochondrial function, protein-lipid complex, and endocytosis (4,6,7,20,21), and the APOE gene itself impacts a variety of processes (22). Two individuals may therefore have similar scores on a global PRS with very different risk associated with the underlying pathways that are perturbed as a result. It is well documented that AD demonstrates substantial heterogeneity with respect to its clinical presentation (23)(24)(25), but also in the distribution of associated pathology and atrophy. For example, potential subtypes have been identified using AD-related biomarkers, including those associated with amyloid (26), tau (27)(28)(29)(30), and neurodegeneration (31)(32)(33)(34)(35).
Breaking down global AD PRS into pathway-specific PRSs may be one approach to better understand the etiology of AD and its heterogeneity. Several studies have examined pathway-specific PRS in the context of AD but have typically focused on associations with diagnostic status or considered only genome-wide significant variants (19,36-39). However, previous studies have highlighted the importance of including larger numbers of variants, including those below the level of genome-wide significance, in relevant analyses (9,14). In addition, there is a growing interest in characterizing individuals based on AD-related pathology involving amyloid, tau and neurodegeneration (i.e., the A/T/N classification system) (40), features that are also likely to have a polygenic basis. As a result, we have examined associations of pathway-specific PRS with dementia status and A/T/N biomarkers to better characterize genetic influences on each.

Participants
Data used in the preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of MCI and early AD.
Our analyses included data on 1,411 participants of European ancestry from the ADNI-1 (n = 699), ADNI-GO/2 (n = 406), and ADNI-3 (n = 306) cohorts. The individuals in these cohorts had genome-wide genotype data that underwent rigorous quality control filters. Our analyses focused on cognitively unimpaired (CU) participants, participants with mild cognitive impairment (MCI), and individuals with AD dementia according to baseline ADNI diagnosis in the cohorts. Procedures were approved by the Institutional Review Board of participating institutions and informed consent was obtained from all participants.
The resulting gene sets contain a high degree of redundancy, so we generated "pathway clusters" comprising gene sets with a high proportion of overlapping genes. First, the Cytoscape app EnrichmentMap v3.3 (50) was used to generate networks with gene sets as nodes and proportion of overlapping genes as edges. A gene set threshold of FDR q < 0.25 and overlap threshold of 0.5 were used as node and edge parameters, respectively. A permissive threshold was chosen to consider a broader set of potential risk pathways. Next, the AutoAnnotate app (http://baderlab.org/Software/AutoAnnotate) was used to cluster nodes with the MCL Cluster algorithm. Variants in genes belonging to a pathway cluster were used to calculate pathway-specific PRSs.
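The clustering step can be illustrated with a minimal sketch. It assumes the edge weight is the overlap coefficient (shared genes divided by the size of the smaller gene set), which is one of several similarity measures such networks can use, and it substitutes simple connected components for the MCL algorithm. The gene set names and gene labels are hypothetical.

```python
from itertools import combinations


def overlap_coefficient(a, b):
    """Proportion of shared genes relative to the smaller gene set (assumed metric)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def cluster_gene_sets(gene_sets, overlap_threshold=0.5):
    """Group gene sets linked by edges whose overlap meets the threshold.

    Connected components are used here as a simple stand-in for MCL clustering.
    """
    names = list(gene_sets)
    neighbors = {name: set() for name in names}
    for left, right in combinations(names, 2):
        if overlap_coefficient(gene_sets[left], gene_sets[right]) >= overlap_threshold:
            neighbors[left].add(right)
            neighbors[right].add(left)

    clusters, seen = [], set()
    for name in names:
        if name in seen:
            continue
        stack, component = [name], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical toy gene sets; membership is illustrative only.
toy_gene_sets = {
    "set_a": {"G1", "G2", "G3", "G4"},
    "set_b": {"G3", "G4", "G5"},
    "set_c": {"G8", "G9"},
}
toy_clusters = cluster_gene_sets(toy_gene_sets)
```

In this toy input, set_a and set_b share two of the three genes in the smaller set and are therefore merged, whereas set_c remains its own cluster. MCL can split densely connected components further, so real cluster boundaries may differ from this simplification.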

Pathway-specific polygenic risk scores
Pathway-specific PRSs were calculated based on summary statistic effect sizes from the Alzheimer's disease GWAS (4) using the PRSet function in PRSice-2 v2.3.5 (51). Each pathway PRS was calculated from SNPs mapped to genes contained in each pathway cluster with a 35-kb upstream/10-kb downstream window. A global PRS considering all SNPs contained in the summary statistics was also calculated.
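As a rough illustration of the position-based mapping, the sketch below assigns SNPs to a pathway when they fall within a gene body extended by 35 kb upstream and 10 kb downstream. Treating "upstream" relative to the direction of transcription is an assumption of this sketch, and the gene names, coordinates, and SNP identifiers are hypothetical.

```python
def gene_window(start, end, strand, upstream=35_000, downstream=10_000):
    """Return the (low, high) interval covering a gene plus its flanking windows.

    Assumption: upstream is defined relative to transcription, so the 35-kb
    flank precedes `start` on the + strand and follows `end` on the - strand.
    """
    if strand == "+":
        return max(0, start - upstream), end + downstream
    if strand == "-":
        return max(0, start - downstream), end + upstream
    raise ValueError(f"unknown strand: {strand!r}")


def snps_in_pathway(snps, genes, pathway_genes):
    """Collect IDs of SNPs that fall inside the window of any pathway gene."""
    selected = set()
    for gene_name in pathway_genes:
        chrom, start, end, strand = genes[gene_name]
        low, high = gene_window(start, end, strand)
        for snp_id, (snp_chrom, pos) in snps.items():
            if snp_chrom == chrom and low <= pos <= high:
                selected.add(snp_id)
    return selected


# Hypothetical coordinates, for illustration only.
toy_genes = {
    "GENE_PLUS": ("1", 100_000, 120_000, "+"),
    "GENE_MINUS": ("1", 500_000, 520_000, "-"),
}
toy_snps = {
    "snp1": ("1", 70_000),   # 30 kb upstream of GENE_PLUS: inside the window
    "snp2": ("1", 135_000),  # 15 kb downstream of GENE_PLUS: outside
    "snp3": ("1", 550_000),  # 30 kb upstream of GENE_MINUS: inside
    "snp4": ("2", 110_000),  # different chromosome: outside
}
toy_selected = snps_in_pathway(toy_snps, toy_genes, ["GENE_PLUS", "GENE_MINUS"])
```

Because a SNP can fall within the windows of genes in more than one cluster, the same variant may contribute to several pathway scores.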
Prior to scoring, SNPs with MAF < 0.01 and imputation quality R² < 0.5 were excluded from the analysis. LD clumping (r² threshold of 0.2 in a 500 kb window) based on LD patterns in the 1000 Genomes EUR cohort was used to restrict scoring to independent loci. To determine the effect of PRSs independent of APOE, the region surrounding the APOE gene was removed (chr19:45,116,911-46,318,605 according to GRCh37). All remaining SNPs (i.e., p < 1) were used in scoring.
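The filtering and scoring logic can be summarized in a minimal sketch. It applies the MAF, imputation-quality, and APOE-region filters stated above and then sums effect-allele dosages weighted by GWAS effect sizes. LD clumping is not implemented here, the variant records and dosages are hypothetical, and dedicated software may additionally scale the sum (for example, by the number of scored alleles).

```python
# APOE exclusion region on GRCh37, as given in the text.
APOE_REGION = ("19", 45_116_911, 46_318_605)


def keep_variant(variant, min_maf=0.01, min_r2=0.5, excluded_region=APOE_REGION):
    """Apply the MAF, imputation-quality, and APOE-region filters."""
    if variant["maf"] < min_maf or variant["r2"] < min_r2:
        return False
    chrom, start, end = excluded_region
    if variant["chrom"] == chrom and start <= variant["pos"] <= end:
        return False
    return True


def polygenic_score(dosages, variants):
    """Sum effect-allele dosages weighted by GWAS effect sizes (unscaled)."""
    return sum(
        variant["beta"] * dosages[variant["id"]]
        for variant in variants
        if keep_variant(variant) and variant["id"] in dosages
    )


# Hypothetical variants and dosages, for illustration only.
toy_variants = [
    {"id": "snp1", "chrom": "1", "pos": 1_000_000, "maf": 0.20, "r2": 0.95, "beta": 0.10},
    # Falls inside the excluded APOE region.
    {"id": "snp2", "chrom": "19", "pos": 45_400_000, "maf": 0.15, "r2": 0.99, "beta": 1.20},
    # Removed by the MAF filter.
    {"id": "snp3", "chrom": "2", "pos": 2_000_000, "maf": 0.005, "r2": 0.90, "beta": 0.30},
    {"id": "snp4", "chrom": "3", "pos": 3_000_000, "maf": 0.30, "r2": 0.80, "beta": -0.05},
]
toy_dosages = {"snp1": 2, "snp2": 1, "snp3": 1, "snp4": 1}
toy_score = polygenic_score(toy_dosages, toy_variants)
```

Only snp1 and snp4 pass the filters in this toy input, so the score reflects two copies of the first effect allele and one copy of the second.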

Measures of amyloid, tau, and neurodegeneration
We explored associations between pathway-specific PRS and CSF and PET measures of amyloid and tau. Because CSF and PET measures are both indicative of the presence of pathology, we combined classifications from each to maximize the sample size for our analyses. Individuals were classified as Aβ-positive or tau-positive based on having abnormal levels from at least one CSF or PET measure at a given data collection timepoint (e.g., a classification of Aβ+ could be based on abnormal levels of amyloid from either a CSF or PET assessment, or both).
Neurodegeneration was indexed by the ratio of hippocampal volume to intracranial volume. As there is no established cut-off for abnormal hippocampal volume, this was used as a continuous measure in all analyses.
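The biomarker definitions above can be expressed as a short sketch. The rule that a single abnormal CSF or PET measure yields a positive classification follows the text; treating a participant as negative when only one measure is available and it is normal, and as missing when neither is available, is an assumption of this sketch. The function names and example volumes are hypothetical.

```python
def combine_status(csf_abnormal, pet_abnormal):
    """Combine CSF and PET abnormality flags; None marks a missing measure.

    Returns True if any available measure is abnormal, False if at least one
    measure is available and none is abnormal, and None if both are missing.
    """
    available = [flag for flag in (csf_abnormal, pet_abnormal) if flag is not None]
    if not available:
        return None
    return any(available)


def hippocampal_ratio(hippocampal_volume, intracranial_volume):
    """Ratio of hippocampal volume to intracranial volume (same units)."""
    if intracranial_volume <= 0:
        raise ValueError("intracranial volume must be positive")
    return hippocampal_volume / intracranial_volume


# Hypothetical volumes in cubic millimetres, for illustration only.
toy_ratio = hippocampal_ratio(7_000.0, 1_400_000.0)
```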

Statistical analysis
All analyses were conducted with R v 4.2.1 (58). Differences in demographic variables were tested with t-tests for continuous variables and chi-squared tests for categorical variables. Analyses of baseline diagnostic status were restricted to cognitively unimpaired (CU) participants and those with dementia at their baseline visit, and associations were tested using logistic regressions with diagnosis (CU vs dementia) as outcome. Associations with A/T/N biomarkers included participants of all diagnostic categories and used the first timepoint at which biomarker measures from all three categories were available. For amyloid and tau, our logistic regression models took biomarker abnormality status (positive vs negative) as the outcome. For hippocampal volume, linear regression used the ratio of hippocampal volume to ICV as the outcome. Magnet field strength was also included as a covariate in analyses of hippocampal volume. Separate models were run with each PRS as predictor. The effect of APOE was assessed by including number of APOE-ε4 alleles (0, 1, or 2) as a separate variable. Models additionally included age, gender, and the first three principal components of the ADNI cohort genetic relationship matrix to control for any cryptic population stratification (59). Several post-hoc analyses were run to provide additional context to the PRS effect. Participants were stratified by APOE-ε4 into carriers and non-carriers, and PRS effects were tested separately in each group. Differences between these groups were directly tested with an interaction between APOE-ε4 carrier status and pathway PRS. To determine whether pathway-specific PRS in aggregate would increase predictive power, we fit models for each outcome (i.e., clinical status and biomarker abnormality) that included all pathway-specific PRS in the same model. The fit of these models was compared to that of models that included only the global PRS using Vuong's likelihood ratio test (60). 
We corrected for multiple comparisons using Benjamini-Hochberg false discovery rate (FDR)-adjustment (61).
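For readers unfamiliar with the Benjamini-Hochberg procedure, the adjustment can be sketched as follows. This is a generic illustration of the standard step-up method rather than the code used in our analyses, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical p-values, for illustration only.
toy_p = [0.01, 0.04, 0.03, 0.20]
toy_q = benjamini_hochberg(toy_p)
significant = [q < 0.05 for q in toy_q]
```

In this toy example only the smallest p-value remains below 0.05 after adjustment, even though three of the four unadjusted values are nominally significant.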

Pathway analysis
The results of MAGMA analysis of the Alzheimer's GWAS summary statistics suggested that several pathways were significantly enriched among associated variants, all surviving FDR correction for multiple comparisons (Supplementary Table S1). These included negative regulation of amyloid precursor protein catabolic process (GO:1902992), regulation of aspartic-type peptidase activity (GO:1905245), negative regulation of cellular component organization (GO:0051129), negative regulation of amyloid-beta formation (GO:1902430), and regulation of humoral immune response mediated by circulating immunoglobulin (GO:0002923). A number of other gene sets were nominally significant (p < 0.05, uncorrected) and those with FDR q < 0.25 were included in clustering (Supplementary Table S1).
Clustering yielded 13 pathway clusters: protein localization (including regulation of amyloid-beta and tau protein kinase activity), cholesterol transport, amyloid protein processing, immune signaling, inflammatory response (including microglial activation), endocytosis and fibril regulation, humoral immune response (including regulation of complement activation), receptor metabolic process, responses to misfolded protein, phototransduction, regulation of cell junction assembly, regulation of protein tyrosine, and mitophagy. Variants in the enriched pathways were used to construct the pathway-specific PRS. Supplementary Table S2 lists the gene sets included in each of the pathway clusters.

Associations with diagnostic status
Sample characteristics of participants included in the analysis of diagnostic status are listed in Table 1, and full results are shown in Fig. 1 and Supplementary Table S3. In models including PRS and APOE-ε4 status, a higher global PRS (β = 1.91, t-value = 11.01, p < 0.001) and number of APOE-ε4 alleles (β = 0.94, t-value = 8.60, p < 0.001) were significantly associated with an Alzheimer's dementia diagnosis. Eight of the 13 pathway PRSs were significantly associated with diagnostic status after correction for multiple comparisons. These included: protein localization, cholesterol transport, amyloid protein processing, immune signaling, endocytosis and fibril regulation, regulation cell junction, regulation protein tyrosine, and mitophagy. When examining APOE-ε4 non-carriers only, results were similar except that cholesterol transport and regulation protein tyrosine were no longer significant whereas the receptor metabolic process PRS went from non-significant to significant. In APOE-ε4 carriers, only the global, regulation cell junction, and regulation protein tyrosine PRSs were significant. The effects of the global PRS (β = -0.54, t-value = -2.10, p = 0.035) and protein localization PRS (β = -0.45, t-value = -2.41, p = 0.016) were significantly weaker in APOE-ε4 carriers than non-carriers, whereas the phototransduction PRS was stronger in APOE-ε4 carriers (β = 0.39, t-value = 2.17, p = 0.030). However, these did not survive correction for multiple comparisons.

Associations with amyloid positivity
Sample characteristics of participants included in the analysis of biomarkers are listed in Table 2, and full results of the associations with amyloid positivity are shown in Fig. 2 and Supplementary Table S4. In models including PRS and APOE-ε4 status, a higher global PRS (β = 0.45, t-value = 4.21, p < 0.001) and number of APOE-ε4 alleles (β = 1.14, t-value = 9.18, p < 0.001) were significantly associated with amyloid positivity. Three of the 13 pathway PRSs were significantly associated with amyloid status after correction for multiple comparisons. These included: endocytosis and fibril regulation, response misfolded protein, and regulation protein tyrosine. When examining APOE-ε4 non-carriers only, the global PRS as well as the endocytosis and fibril regulation PRS were significant. In APOE-ε4 carriers, only the global and regulation protein tyrosine PRSs were significant. The effect of the regulation protein tyrosine PRS was significantly stronger in APOE-ε4 carriers than non-carriers (β = 0.52, t-value = 2.55, p = 0.010), but this did not survive correction for multiple comparisons.

Associations with tau positivity
Full results of the associations with tau positivity are shown in Fig. 3 and Supplementary Table S5. In models including PRS and APOE-ε4 status, a higher global PRS (β = 0.43, t-value = 4.25, p < 0.001) and number of APOE-ε4 alleles (β = 0.62, t-value = 6.63, p < 0.001) were significantly associated with tau positivity. Three of the 13 pathway PRSs were significantly associated with tau status after correction for multiple comparisons. These included: protein localization, immune signaling, and mitophagy. When examining APOE-ε4 non-carriers only, results were similar except that the mitophagy PRS was no longer significant, whereas the inflammatory response PRS became significant. In APOE-ε4 carriers, neither the global PRS nor any of the pathway PRSs were significant. The effect of the immune signaling PRS was significantly weaker in APOE-ε4 carriers than non-carriers (β = -0.66, t-value = -3.72, p < 0.001), but this did not survive correction for multiple comparisons.

Associations with hippocampal volume
Full results of the associations with hippocampal volume are shown in Fig. 4

Discussion
The current results support and extend previous work disentangling the biological pathways contributing to Alzheimer's disease risk and pathogenesis. Consistent with previous findings, we found that a global AD PRS was significantly associated with diagnostic status, amyloid and tau positivity, and hippocampal volume (9-13, 15, 17). The global AD PRS captures the combined effects of multiple separable influences on disease risk and therefore is not useful for teasing apart genetically-mediated etiological differences among individuals. Several studies have examined the relationship of AD diagnosis or AD-related biomarkers with pathway-specific PRS calculated from GWAS-significant SNPs (36, 37, 39). Here, we generated pathway-specific PRSs from clusters of gene sets and SNPs with association strength p-values falling below the threshold of GWAS significance. Breaking down the global effects of polygenic factors into more refined, pathway-associated subsets of polygenes has been shown to provide useful information above and beyond variants with more pronounced effects arising from GWAS for certain conditions, including AD and MCI (8, 9, 14).
As expected, most of the pathway PRSs associated with AD diagnostic status in the current study correspond to pathways that have consistently been uncovered in previous GWAS. These pathways include amyloid precursor processing, immune and microglial response, endocytosis, cholesterol transport, lipid-protein complex and amyloid clearance (4,6,7,21). Several pathways of focus in our study, however, had less support from other GWAS, but have nonetheless been linked to AD in yet other studies. For example, the regulation cell junction and regulation protein tyrosine pathway-specific PRSs we considered may capture effects of genes involved in synaptic functioning and cell signaling, consistent with studies suggesting these pathways are involved in AD-related cognitive decline (62, 63). In addition, mitochondrial function has been proposed as playing a key role in the development of AD (64, 65), and mitophagy in particular may have widespread impacts on age-related disease, including Alzheimer's disease (66). Our results are also consistent with studies that have examined the association of pathway-specific PRS with AD diagnosis, including Aβ clearance, cholesterol transport, immune response, and endocytosis (19,36,37,39).
Amyloid positivity was significantly associated with pathway-specific PRS for endocytosis and fibril regulation, response to misfolded proteins, and regulation of protein tyrosine. The pathways associated with these PRSs are involved in the production, trafficking and clearance of Aβ peptides, as well as their aggregation into fibrils, which lends these associations biological plausibility. The endocytic pathway plays a key role in the amyloidogenic processing of APP as it is internalized to the intracellular space followed by cleavage into Aβ in the early endosome (67, 68). Tyrosine kinases may be involved in both the trafficking of APP and upregulating BACE activity (69, 70). In addition, genes encompassed by the response misfolded protein PRS that we find associated with AD pathology include molecular chaperones (e.g., CLU) and the ubiquitin-proteasome system, which mediate degradation of abnormal and misfolded proteins (71)(72)(73). A previous study examining pathway-PRS found that PRS related to Aβ clearance and cholesterol metabolism were also related to CSF and PET measures of amyloid; however, these scores only included GWAS-significant variants and the effects were primarily driven by APOE (37).
We also found that tau positivity was significantly associated with pathway PRSs for protein localization, immune signaling, and mitophagy. The protein localization pathway includes tau protein kinase activity, which may relate to abnormal hyperphosphorylation of tau (74). Sun et al. (75) also found that a PRS reflecting tau kinase activity was associated with CSF and PET measures of tau. Tau binds to microtubules to provide stabilization, but detaches when phosphorylated, which can reduce axonal integrity and disrupt protein transport along the cytoskeleton (76). Additionally, this unbound phosphorylated tau can aggregate into neurofibrillary tangles (77). Regarding the PRS reflecting immune signaling, although inflammation can be secondary to abnormal tau, there is also evidence for an upstream role of glial activation and neuroinflammation in driving the accumulation and spread of tau (78, 79). This is consistent with a previous finding that a PRS constructed from only GWAS-significant variants related to immune response was associated with CSF tau (37). As with inflammation, disruptions to mitophagy may be secondary to disease processes, but there is evidence for upstream roles of mitochondrial function, including mitophagy, in the development of AD pathology (65, 80). Hippocampal volume was also associated with pathway PRSs for protein localization and mitophagy. The pathways shared with tau positivity may reflect the tighter linkage between neurodegeneration and tau compared to amyloid (81, 82).
Our results are generally consistent with those of a recent GWAS on CSF measures of amyloid and tau (83). The authors found little overlap in the loci associated with amyloid and tau aside from APOE. In contrast, there was overlap between loci associated with tau and ventricular volume, which can be used as an indicator of neurodegeneration. The authors also examined associations of AD-associated variants with CSF amyloid and tau, and their cluster analysis identified patterns that were broadly similar to our own. For example, variants associated with CSF amyloid were associated with amyloid processing, endocytosis, and tyrosine kinase, whereas variants associated with CSF tau were linked to the immune system. Deriving PRSs based on GWAS of AD biomarkers represents a promising approach to index risk in specific pathways, but sample sizes for such studies are relatively small. PRSs based on pathway analysis of diagnosis-based GWAS are therefore a useful alternative that can leverage large case-control datasets to provide converging information.
Although we focused on APOE-independent sources of AD risk, it is important to note that the number of APOE-ε4 alleles had a comparable or even stronger (in the case of amyloid) effect on the outcomes as the polygenic component indexed with a PRS. APOE belonged to gene sets that were part of several clusters, including endocytosis and fibril regulation, protein localization, cholesterol transport, and amyloid protein processing. The variants falling within the APOE region were excluded from our PRSs, but this does suggest APOE can exert an impact through multiple routes. Stratifying based on APOE-ε4 carrier status suggested stronger effects of pathway PRSs on tau positivity in non-carriers. It may be that smaller polygenic effects are obscured in the presence of a larger APOE-related signal. Alternatively, APOE-ε4 may be sufficient to increase risk for tau pathology whereas those lacking this risk allele require additional sources of risk to develop abnormal levels of tau.
There are several additional items worth mentioning to put our analyses into context. First, we find evidence that the effect size of the global PRS is much larger than that of any pathway-specific PRS in our analyses. While pathway-specific PRSs may be beneficial for understanding disease etiology, they do not appear to add predictive power when considered in the aggregate over and above the global PRS. Second, we mapped SNPs to genes using the standard position-based approach available in MAGMA. However, many GWAS SNPs are located in non-coding regions and may be associated with disease risk through their gene regulatory effects (84). Thus, approaches that make use of information such as chromatin interactions (85) or expression quantitative trait loci (86) to determine which gene a variant present in a non-coding region affects may prove useful in developing pathway-specific PRS. Third, genes/SNPs may be part of multiple pathway clusters, so the pathway PRSs are not entirely independent of each other. Fourth, we added interactions with diagnosis to all biomarker models to determine whether PRS associations differed by group. However, no interaction terms were significant after correcting for multiple comparisons and there did not appear to be a consistent pattern among the few interactions that reached nominal significance. Finally, this analysis required choices of parameters at various steps, such as the p-value thresholds used to filter variants and gene sets, the method used to construct PRS, and even which gene sets were used. We believe our choices represent a reasonable attempt to capture the broad sources of polygenic influence on AD risk while minimizing unrelated signal. However, alternative approaches may be equally valid and should be determined by the context of a given analysis.

Conclusions
Ultimately, we find evidence that some pathway-specific PRSs are associated with AD diagnostic status and A/T/N biomarkers. Our findings indicate that genetic risk for AD may exist along multiple dimensions, and the distribution of risk across pathways may influence phenotypic manifestations of the disease. Although a global PRS appears to provide superior predictive power overall, pathway-specific PRS analysis may help clarify aspects of the heterogeneity of AD pathogenesis.

Competing Interests
The authors declare that they have no competing interests.
Gerring ZF, Mina-Vargas A, Gamazon ER, Derks EM. E-MAGMA: an eQTL-informed method to identify risk genes using genome-wide association study summary statistics. Bioinformatics. 2021;37(16):2245-9.

Figure 1
Associations of polygenic risk scores with diagnostic status. Logistic regressions were used with diagnostic status (cognitively unimpaired vs dementia) as the outcome. Separate models used each PRS as predictor. Models either 1) included number of APOE-ε4 alleles as a separate variable, 2) tested only APOE-ε4 non-carriers, or 3) tested only APOE-ε4 carriers. All models adjusted for age, gender, and the first 3 genetic principal components. Plots show standardized regression coefficients (log-odds) and standard errors. Associations that survived FDR correction are bolded.

Figure 2
Associations of polygenic risk scores with amyloid positivity. Logistic regressions were used with amyloid status (positive vs negative) as the outcome. Separate models used each PRS as predictor. Models either 1) included number of APOE-ε4 alleles as a separate variable, 2) tested only APOE-ε4 non-carriers, or 3) tested only APOE-ε4 carriers. All models adjusted for age, gender, and the first 3 genetic principal components. Plots show standardized regression coefficients (log-odds) and standard errors. Associations that survived FDR correction are bolded.

Figure 3
Associations of polygenic risk scores with ptau positivity. Logistic regressions were used with ptau status (positive vs negative) as the outcome. Separate models used each PRS as predictor. Models either 1) included number of APOE-ε4 alleles as a separate variable, 2) tested only APOE-ε4 non-carriers, or 3) tested only APOE-ε4 carriers. All models adjusted for age, gender, and the first 3 genetic principal components. Plots show standardized regression coefficients (log-odds) and standard errors. Associations that survived FDR correction are bolded.