Participants
We initially considered the cohort of 56,013 participants who attended the UKB’s imaging visit (instance 2) between August 2013 and July 2022, with written informed consent obtained from all participants [23]. We excluded participants with missing genetic information, those residing outside of England, those with a diagnosis of organic brain syndrome or dementia, those with no available imaging-derived phenotypes (IDPs), and individuals without complete demographic information. After applying the exclusion criteria, 29,645 cognitively intact ageing participants were selected and divided into uncorrelated Caucasian (UKB-UC, N = 21,236, for the main analysis), correlated Caucasian (UKB-CC, N = 4,487), and non-Caucasian (UKB-MIX, N = 3,922) subgroups based on genetic relatedness (data field 22020) and genetic ethnic background (data field 22006) (see Supplementary Fig. 1 for a flow chart). Age was calculated as the difference in days between the date of the magnetic resonance imaging (MRI) scan and the date of birth, divided by 365. Descriptions of the variables and the exclusion process are detailed in Supplementary Method 1.
Genotyping and AD-PRS
Genotyping was conducted using a custom Axiom array, with data available for 55,850 participants. The detailed procedures for genotyping, imputation, and quality control have been previously published [24]. The APOE polymorphism was assessed with rs7412 and rs429358. We considered the standard PRS released by the UKB in May 2022 as the measure of AD-PRS [25]. This PRS was computed through a novel algorithm with benchmarked performance, implemented by a Bayesian approach to estimate non-zero weights across approximately 5 million SNPs (including the APOE region) distributed across the genome. The UKB-UC participants were then categorised into low-, intermediate-, and high-risk groups based on tertiles derived from their continuous AD-PRS values.
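The tertile-based categorisation can be illustrated with a minimal Python sketch. It is not the code used in this study: the function name, the example scores, and the handling of values that fall exactly on a cut point (assigned to the lower group) are assumptions made for illustration.

```python
from statistics import quantiles


def assign_risk_groups(prs_scores):
    """Label each score 'low', 'intermediate' or 'high' by tertile.

    Assumption: a score equal to a cut point goes to the lower group.
    """
    # quantiles(n=3) returns the two cut points separating the tertiles.
    lower_cut, upper_cut = quantiles(prs_scores, n=3)
    labels = []
    for score in prs_scores:
        if score <= lower_cut:
            labels.append("low")
        elif score <= upper_cut:
            labels.append("intermediate")
        else:
            labels.append("high")
    return labels


# Hypothetical standardised AD-PRS values for nine participants.
example_scores = [-1.2, -0.8, -0.3, 0.0, 0.1, 0.4, 0.9, 1.5, 2.1]
example_labels = assign_risk_groups(example_scores)
```

With nine distinct scores, each risk group receives three participants.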
Brain MRI phenotypes
The multi-contrast brain imaging data used in this study were acquired by UKB researchers on Siemens Skyra 3T scanners with the standard Siemens 32-channel head coil. The image acquisition process, automated processing pipeline, and IDP generation have been described previously [23]. Briefly, the GM, WM, and ventricular cerebrospinal fluid (CSF) components were delineated with FAST tissue segmentation [26] based on the preprocessed T1-weighted images (T1WI, acquired with a 3D magnetisation-prepared rapid gradient echo sequence at 1×1×1 mm resolution). A total of 96 regional GM IDPs were then extracted within the GM component mask based on the Harvard-Oxford atlases, while 14 subcortical GM IDPs were generated with the FIRST program [27]. WM lesion volumes, denoted as white matter hyperintensity volumes (WMHVs), were identified using the BIANCA program [28] with T2-weighted images (fluid-attenuated inversion recovery sequence at 1.05×1×1 mm resolution). The microstructural measures of WM tract integrity were obtained from diffusion scans (echo-planar imaging sequence; 50 directions at b = 1000 s/mm² and 50 directions at b = 2000 s/mm²). The data from the b = 1000 shell were analysed using DTIFIT to derive tensor-based measures, comprising fractional anisotropy (FA), mean diffusivity (MD), and mode of anisotropy (MO) [29]. The data from the two-shell acquisition were fed into AMICO [30] to generate neurite orientation dispersion and density imaging (NODDI) measures [31], including intra-cellular volume fraction (ICVF), isotropic or free water volume fraction (ISOVF), and orientation dispersion index (OD). To obtain tract-wise diffusion IDPs, WM tract FA-skeleton analysis was performed using the TBSS program [32], with parcellation defined by 48 standard-space tract masks [33, 34].
We focused on IDPs related to regional GM volume, WMHV, and tract-wise measures of WM, which served as measures of brain structures. All volumetric IDPs were normalised via multiplication with the volumetric scaling factor, from individual native T1 space to standard MNI152 space, estimated through the SIENAX procedure [35]. Prior to the statistical analysis, we adjusted the IDPs with linear models to account for confounding variables, including sex, education level, Townsend deprivation index, hypertensive and diabetic status, scan date, scan centre, and 40 genetic principal components [36].
Neuropsychological testing and cognitive factors
We considered nine tests with robust test-retest reliability [37] from the UKB cognitive test batteries, including the reaction time test (data field 20023), numeric memory test (data field 4282), fluid intelligence (data field 20016), two trail-making tests (TMT-A: data field 6348; TMT-B: data field 6350), pairs matching test (data field 521), matrix pattern completion test (data field 6373), symbol digit substitution test (data field 23324) and tower-rearrange test (data field 21004) (see Supplementary Method 2 for descriptions of the cognitive tests, the imputation, and the confirmatory factor analysis (CFA)). Missing values were imputed using random forest-based imputation with the mice package [38]. CFA was conducted with the cfa function from the lavaan package [39] to identify cognitive factors corresponding to the related domains (Supplementary Table 2). Executive function (EF), working memory (WM), reasoning function (RF), and visuospatial function (VS) factors were extracted with a good fit to the data (root mean square error of approximation = 5.81e-03, 90% CI [3.41e-03, 8.31e-03], Tucker–Lewis index = 0.999, 𝜒²(11) = 30.44). Subsequently, the cognitive factor scores were adjusted using the same models applied to the IDPs prior to further analysis.
Sliding window analysis
Model-free sliding window analysis, which is suitable for analysing large-sample cohorts [40, 41], was applied to delineate ageing trajectories within low, intermediate, and high AD-PRS individuals. For the descriptive ageing trajectories within each risk group, a 1-year fixed-width age window was employed, encompassing 10% of the participants from each risk category. In each window, the visualisation point was sampled as the mean of the corrected IDP and the median age, while the window traversed the age distribution in incremental steps of 0.5 years. A smoothing Gaussian kernel (width = 10) was then applied using the moving average method to improve the visualisation results.
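The windowing step can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the study's implementation: the function and argument names are hypothetical, the window is defined by a fixed age width passed by the caller, windows are closed at both ends, and the subsequent smoothing step is omitted.

```python
from statistics import mean, median


def sliding_window_trajectory(ages, values, width, step=0.5, min_count=1):
    """Return one (median age, mean value) point per window position.

    Assumptions: windows have a fixed age width, include both end points,
    and advance from the youngest age in increments of `step` years.
    """
    youngest, oldest = min(ages), max(ages)
    points = []
    k = 0
    # Stop once the window would extend beyond the oldest participant.
    while youngest + k * step + width <= oldest + 1e-9:
        start = youngest + k * step
        end = start + width
        in_window = [(a, v) for a, v in zip(ages, values) if start <= a <= end]
        if len(in_window) >= min_count:
            window_ages = [a for a, _ in in_window]
            window_values = [v for _, v in in_window]
            points.append((median(window_ages), mean(window_values)))
        k += 1
    return points


# Hypothetical ages (years) and corrected IDP values.
example_ages = [60.0, 60.5, 61.0, 61.5, 62.0]
example_values = [10.0, 9.0, 8.0, 7.0, 6.0]
trajectory = sliding_window_trajectory(example_ages, example_values, width=1.0)
```

Each returned pair corresponds to one visualisation point, so plotting the pairs in order traces the descriptive trajectory for a risk group.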
Next, we introduced the intra-age relative difference (IARD) to quantify the age-related differences in brain structures between individuals with high and low AD-PRS. To achieve this, the intra-age response ratio (IARR) was initially computed with the means_ratio function from the effectsize package [42], as the sample-size bias-corrected ratio [43] of the mean of the corrected IDP between the low-risk and high-risk groups within the same age window. The IARD was then calculated as \(IARD = 1 - \frac{1}{IARR}\), with a positive IARD representing a decrease in the IDP value in the high-risk group and vice versa. To finely map the evolution of IARD with age and mitigate window-selection bias, we sampled IARD values across a range of age windows spanning from 5 to 15 years with incrementing steps of 0.1 years.
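For a single age window, the relationship between the IARR and the IARD can be illustrated with the short Python sketch below. It uses the plain ratio of group means and therefore omits the sample-size bias correction described above. The function name and the example values are hypothetical.

```python
from statistics import mean


def intra_age_relative_difference(low_risk_values, high_risk_values):
    """Compute IARD = 1 - 1/IARR for one age window.

    Assumption: IARR is the uncorrected ratio mean(low) / mean(high);
    no small-sample bias correction is applied in this sketch.
    """
    iarr = mean(low_risk_values) / mean(high_risk_values)
    return 1.0 - 1.0 / iarr


# Hypothetical corrected IDP values within one age window.
low_risk = [10.0, 12.0]   # mean 11
high_risk = [9.0, 11.0]   # mean 10
iard = intra_age_relative_difference(low_risk, high_risk)
```

Because \(IARD = 1 - \frac{1}{IARR}\) equals the difference between the low-risk and high-risk means divided by the low-risk mean, the example gives (11 - 10)/11, a positive value indicating a lower IDP in the high-risk group.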
Statistical analysis
We examined between-group differences in demographics across the AD-PRS strata (low, intermediate, and high) using one-way analysis of variance (ANOVA) for continuous variables and Pearson's Chi-squared test for categorical variables.
The association between the IARD and age was investigated with pooled data across all age windows. Initially, the correlation between the IARD and age was examined through Spearman’s rho. Subsequently, generalised additive models (GAMs) were fitted using the gam function from the mgcv package [44], with a basis complexity of 4 and P-spline smoothing. The estimated onset age (EOA) was determined as the point at which the fitted IARD value and its first derivative both exceeded 0 (or both fell below 0 when the age-IARD association was negative). Furthermore, separate linear models were employed to investigate the relationship between the IARD and age for participants aged < 65 years (early-ageing, EA) and > 70 years (late-ageing, LA). The difference between the two slopes was evaluated through the z-statistic, defined as \(z = \frac{\beta_{LA} - \beta_{EA}}{\sqrt{SE_{LA}^{2} + SE_{EA}^{2}}}\) [45].
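The slope comparison and the onset criterion can be sketched in Python as follows. This is an illustration only and not the study's code: the function names and example numbers are hypothetical, the two-sided p-value assumes a standard normal reference distribution, the derivative is approximated by forward differences on an age grid rather than taken from the fitted GAM, and only the positive-association case is handled.

```python
import math


def slope_difference_z(beta_la, se_la, beta_ea, se_ea):
    """z-statistic for the difference between two independent slopes."""
    z = (beta_la - beta_ea) / math.sqrt(se_la ** 2 + se_ea ** 2)
    # Two-sided p-value under a standard normal reference (assumption).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def estimated_onset_age(ages, fitted_iard):
    """Earliest grid age where the fitted value and its slope are both > 0.

    Assumption: the first derivative is approximated by a forward
    difference between neighbouring grid points.
    """
    for i in range(len(ages) - 1):
        slope = (fitted_iard[i + 1] - fitted_iard[i]) / (ages[i + 1] - ages[i])
        if fitted_iard[i] > 0 and slope > 0:
            return ages[i]
    return None  # criterion never met on this grid


# Hypothetical slope estimates and standard errors.
z_stat, p_val = slope_difference_z(beta_la=0.5, se_la=0.03, beta_ea=0.1, se_ea=0.04)

# Hypothetical fitted IARD curve evaluated on a coarse age grid.
grid_ages = [50, 55, 60, 65, 70]
grid_fit = [-0.02, -0.01, 0.0, 0.01, 0.03]
onset = estimated_onset_age(grid_ages, grid_fit)
```

In this example the pooled standard error is 0.05, so the slope difference of 0.4 corresponds to a z-statistic of about 8, and the onset criterion is first met at the grid age of 65.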
Principal component analysis (PCA) was conducted separately for regional volumetric IDPs and diffusion IDPs, among those IDPs displaying a strong correlation between age and the IARD value (|ρ| > 0.8, determined through visual inspection of the relationship between age and IARD across all IDPs). Prior to PCA, minimum-maximum normalisation was applied to the corrected IDPs, and the number of principal components (PCs) was determined through parallel analysis with 10,000 iterations. Subsequently, PC scores were computed using the principal function from the psych package with promax rotation [46]. Finally, 5 GM PCs were extracted from 17 regions, with a cumulative variance of 59.6% (Supplementary Fig. 2); and 10 WM PCs were extracted from the diffusion measures of 32 tracts, with a cumulative variance of 69.9% (Supplementary Fig. 3).
Mediation analyses were next performed with the mediation package [47] to investigate the effects of the AD-PRS on cognitive factors mediated through the PCs (M). We modelled an indirect association between the AD-PRS and cognitive factors through M, with both the direct and indirect paths from the AD-PRS allowed to vary with age. The model equations were defined as follows: \(M_{z} = a_{1} \times PRS_{C} + a_{2} \times age_{C} + a_{3} \times PRS_{C} \times age_{C} + e_{M_{z}}\), and \(cognition_{z} = b_{1} \times M_{z} + b_{2} \times age_{C} + b_{3} \times PRS_{C} \times age_{C} + c' \times PRS_{C} + e_{cognition_{z}}\) (\(PRS_{C}\) and \(age_{C}\) denote the centred AD-PRS and age, respectively, and \(M_{z}\) and \(cognition_{z}\) represent the z scores of M and the cognitive factor, respectively). Furthermore, to explore the age-related mediations, we conducted age-moderated mediation analysis by examining the difference in the average causal mediation effects (ACME, \(a_{1}b_{1}\)) between two age strata (mean age ± 1 SD), with the confidence intervals of the difference estimated by nonparametric bootstrap with 10,000 simulations.
All the statistical analyses were performed using R version 4.3.1. Analyses related to the relationship between the IARD score and age were conducted across all IDPs. False discovery rate (FDR) correction was applied to control for false positive results across 125 GM volumes (120 regional volumes and 5 whole-brain volumes) and 48 WM tracts separately. Moderated mediation analyses were performed across PCs and cognitive factors with FDR correction. P-values less than 0.05 after correction were deemed statistically significant.
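The multiple-comparison step can be illustrated with a minimal Python sketch. It assumes the Benjamini-Hochberg procedure, which is one common way to control the FDR; the text above does not state which FDR procedure was used. The function name and example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values from four tests within one family.
raw_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in adjusted_p]
```

Applying the correction separately to each family of tests, as done here for the GM volumes and the WM tracts, means the adjustment depends only on the number of tests within that family.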
Sensitivity analysis
Sensitivity analyses were conducted to assess the robustness of our results. To account for potential computational bias in the subcortical volumes, we replicated the analyses using normalised subcortical volumes calculated from the Harvard-Oxford subcortical atlases with the FAST program. Furthermore, to demonstrate the reliability of the tract-wise results, analyses were repeated using weighted mean diffusion measures of the 27 major tracts defined by AutoPtx [48] through a probabilistic tractography approach [49].
To address potential bias related to the selection of stratified risk groups and to rule out effects of the APOE polymorphism on the results, we re-examined Spearman’s ρ between age and IARD based on different pairs of risk groups (low-intermediate, intermediate-high). Pearson’s r was used to assess the spatial correlations of the ρ values from each risk pair, with Bonferroni correction applied. In addition, within individuals of different APOE genotypes drawn from the UKB-UC (N = 2,666 for ε2ε3, N = 12,538 for ε3ε3, and N = 4,939 for ε3ε4), we employed linear regression models to assess the effects of age, AD-PRS tertile within each APOE genotype group, and their interactions on IDPs while accounting for the covariates described above. Separate models were fitted for participants aged < 65 and > 70 years to elucidate the effects in different ageing stages, and FDR corrections were applied.
To address sample selection bias, we assessed the effects of the AD-PRS, age, and their interactions on the unadjusted IDPs using linear models that included the predefined confounding variables as covariates. These models were fitted separately for participants with distinct ethnic backgrounds (UKB-UC, UKB-CC, and UKB-MIX), with FDR correction applied.