2.1 Dataset used
Data used in the analyses reported in this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial MRI, PET, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early Alzheimer’s disease (AD).
ADNI is a longitudinal study that started as a five-year initiative (ADNI-1) and was followed by the ADNI Grand Opportunities (ADNI-GO), ADNI-2 and ADNI-3 protocols. The participants, aged 55-90 years, were recruited at 57 sites in the United States and Canada. The study procedures were approved by the institutional review boards of all participating centers, and written informed consent was provided by all participants.
2.1.1 Sample selection
ADNI data were downloaded on 2018/09/28. Of the 2046 individuals in the database, 797 (39%) had elevated baseline levels of amyloid in the brain (PET florbetapir SUVR > 1.11 13 and/or CSF Aβ1-42 < 880 pg/mL 14). Data from these individuals, subsequently referred to as ‘amyloid positive’, were used in the present study. The longitudinal assessment of cognition was divided into two sub-analyses: i) using baseline pNfL levels, and ii) using the baseline ROC calculated from longitudinal measures of pNfL within individuals. The first sub-analysis included 735 amyloid positive participants (sample A in Figure 1) with a baseline measurement of pNfL and the following longitudinal cognitive assessments: Mini-Mental State Examination, MMSE 15; Alzheimer’s Disease Assessment Scale-Cognitive 13, ADAS-Cog 13 16; Clinical Dementia Rating-Sum of Boxes, CDR-SB 17,18; Alzheimer’s Disease Composite Score, ADCOMS 19; Preclinical Alzheimer’s Cognitive Composite including Trail-Making Test B, PACC 20,21. In addition, all individuals had complete data on time (in years) since baseline (i.e. since joining the study cohort), age at baseline (years), sex, years of education, and apolipoprotein E (APOE) ɛ4 allele binary status (APOE+ defined as having at least one ɛ4 allele). The second sub-analysis included 336 amyloid positive participants (sample B in Figure 1) with two or more measurements of pNfL within 48 months from baseline, repeated measures of the cognitive scores more than 48 months after baseline, and the covariates mentioned above.
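The amyloid positivity rule stated above can be written as a short sketch. The function name and the handling of missing measurements are assumptions made for illustration; only the two cut-offs come from the text.

```python
from typing import Optional

SUVR_CUTOFF = 1.11        # florbetapir PET SUVR threshold quoted in the text
CSF_ABETA_CUTOFF = 880.0  # CSF Abeta1-42 threshold (pg/mL) quoted in the text


def is_amyloid_positive(pet_suvr: Optional[float],
                        csf_abeta42: Optional[float]) -> bool:
    """Return True if either available baseline marker crosses its cut-off.

    A missing measurement (None) is treated as "not positive on that marker";
    this treatment of missing data is an assumption of the sketch.
    """
    pet_positive = pet_suvr is not None and pet_suvr > SUVR_CUTOFF
    csf_positive = csf_abeta42 is not None and csf_abeta42 < CSF_ABETA_CUTOFF
    # "and/or" in the text: positivity on one marker is sufficient.
    return pet_positive or csf_positive
```

Both inequalities are strict, so a participant sitting exactly on a cut-off is not classified as positive on that marker.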
The dataset used to compare the ROC of pNfL with the ROC of various imaging markers of AD included 236 amyloid positive participants (sample C in Figure 1) with two or more measurements of each of the following within 48 months of baseline: pNfL (pg/mL), ventricular volume (mm³), hippocampus volume (mm³), whole brain volume (mm³) 22, and amyloid accumulation (standardised uptake value ratio, SUVR) measured with florbetapir PET (AV-45-PET) 23. Restricting the measurements to this relatively short period makes the analyses comparable and makes the assumption of a constant rate of change between time points more robust. All volumetric measures were adjusted for total intracranial volume 24. For the development of the pNfL trajectory, we used data from 501 individuals (sample D in Figure 1) who had two or more measurements of pNfL, ADAS-Cog 13 and MMSE.
2.2 Statistical Analysis
2.2.1 Association between pNfL and longitudinal cognitive decline (Samples A and B)
We analysed the association between cognitive scores over time (outcome) and continuous baseline pNfL (log10 transformed) using linear mixed effects (LME) models and beta regression, both with random intercepts and slopes to account for repeated outcome measures. Cognitive scores are bounded between a minimum and a maximum value, so assuming normally distributed errors is not appropriate. Beta regression overcomes this by assuming that the outcome follows a beta distribution, which can take various shapes depending on the values of its two parameters 25. Both the LME and beta regression models were adjusted for time (years) since baseline and the following potential confounders: baseline age (years), sex (binary classification), education level (years), and APOE ɛ4 allele status (binary). A quadratic term for time, which improved the model fit (as judged by the significance of the additional term), was included in the LME model. Baseline pNfL values were also grouped into quartiles for visualisation, and the cognitive trajectories corresponding to each baseline quartile were plotted.
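The quartile grouping used for visualisation can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and the interpolation rule used to place the cut points (the default of Python's `statistics.quantiles`) is an assumption, since the text does not state one.

```python
import math
from bisect import bisect_right
from statistics import quantiles
from typing import List


def baseline_quartiles(pnfl_pg_ml: List[float]) -> List[int]:
    """Assign each baseline pNfL value to a quartile group (1 = lowest)."""
    # Work on the log10 scale, as in the regression models described above.
    log_values = [math.log10(v) for v in pnfl_pg_ml]
    # Three cut points that split the sample into four groups.
    cuts = quantiles(log_values, n=4)
    # The group index is one plus the number of cut points at or below the value.
    return [bisect_right(cuts, x) + 1 for x in log_values]


groups = baseline_quartiles([10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0])
print(groups)
```

Each participant keeps the same group label for all follow-up visits, so the plotted trajectories reflect the baseline pNfL level rather than later values.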
Using data collected between baseline and month 48, we also calculated the average ROC in pNfL using only the first three visits at which pNfL was measured; we define this ROC as the ‘baseline ROC’. The method used to calculate an individual’s ROC is explained in Section 2.2.2. The last visit used to calculate the baseline ROC was then treated as the new baseline for the subsequent visits at which cognitive scores were measured. In other words, cognitive data from the period used to calculate the baseline ROC were discarded, and subsequent analyses were based purely on longitudinal cognitive data recorded after the baseline ROC period. We then tested whether this baseline ROC of pNfL was associated with longitudinal cognitive decline, using linear and beta mixed effects models in the same way as explained above and adjusting for the same covariates. Corresponding plots using quartiles of the baseline ROC were created.
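The re-baselining step can be illustrated with a minimal sketch for a single participant. The function name, the tuple layout, and the choice to keep a cognitive visit that falls exactly on the new baseline are assumptions made for illustration.

```python
from typing import List, Tuple

Visit = Tuple[float, float]  # (years since study baseline, measured value)


def rebaseline(pnfl_visits: List[Visit],
               cognitive_visits: List[Visit],
               window_years: float = 4.0,
               max_visits: int = 3) -> Tuple[List[Visit], List[Visit]]:
    """Split follow-up into the baseline-ROC period and the later cognitive period."""
    # pNfL visits inside the 48-month window, in chronological order.
    in_window = sorted(v for v in pnfl_visits if v[0] <= window_years)
    roc_visits = in_window[:max_visits]
    if len(roc_visits) < 2:
        raise ValueError("at least two pNfL measurements are needed for a ROC")
    # The last pNfL visit used for the ROC becomes time zero.
    new_baseline = roc_visits[-1][0]
    # Assumption: a cognitive visit exactly at the new baseline is kept.
    later_cognition = [(t - new_baseline, score)
                       for t, score in sorted(cognitive_visits)
                       if t >= new_baseline]
    return roc_visits, later_cognition


pnfl = [(0.0, 30.0), (1.0, 33.0), (2.0, 35.0), (3.0, 40.0)]
mmse = [(0.0, 29.0), (1.0, 28.0), (2.0, 28.0), (3.0, 26.0), (4.0, 25.0)]
roc_visits, later_mmse = rebaseline(pnfl, mmse)
print(roc_visits, later_mmse)
```

Separating the two periods in this way means the pNfL measurements that define the predictor never overlap in time with the cognitive outcomes they are used to predict.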
2.2.2 Rate of change over time in pNfL and AD-related biomarkers (Sample C)
To estimate the ROC of pNfL for each individual in the cohort, we fitted a mixed effects model with log10 transformed pNfL values (scaled to have zero mean and unit standard deviation) as the outcome and years since baseline as the predictor. Random intercepts and random slopes were included to allow each individual’s trajectory over time to deviate from the overall trend. The individual-specific posterior estimates of the random slopes were used as the measure of the ROC in pNfL over time. The same method was used to calculate the ROC in the imaging markers: ventricular volume, hippocampus volume, whole brain volume, and florbetapir PET SUVR. We then performed linear regression between the estimated ROC in pNfL and the ROC in each of these volumetric brain structure markers and florbetapir PET SUVR, accounting for the following time-invariant covariates: baseline age, sex, years of education, and APOE ɛ4 allele binary status.
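To make the idea of an individual-specific slope concrete, the sketch below uses a deliberately simplified stand-in: an ordinary least-squares slope fitted separately for each participant on standardised log10 pNfL. This is not the random-slope model described above, whose posterior slope estimates are shrunk towards the population trend; the function names and the record layout are assumptions.

```python
import math
from statistics import mean, stdev
from typing import Dict, List, Tuple

Record = Tuple[str, float, float]  # (participant id, years since baseline, pNfL pg/mL)


def standardise_log10(values: List[float]) -> List[float]:
    """Log10 transform, then scale to zero mean and unit standard deviation."""
    logs = [math.log10(v) for v in values]
    centre, spread = mean(logs), stdev(logs)
    return [(x - centre) / spread for x in logs]


def ols_slope(times: List[float], outcomes: List[float]) -> float:
    """Least-squares slope of outcome on time for one participant."""
    t_bar, y_bar = mean(times), mean(outcomes)
    sxx = sum((t - t_bar) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("need measurements at two or more distinct times")
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, outcomes))
    return sxy / sxx


def individual_rocs(records: List[Record]) -> Dict[str, float]:
    """Per-participant slope of standardised log10 pNfL per year."""
    # Standardise across all observations, not within participants.
    z_values = standardise_log10([r[2] for r in records])
    grouped: Dict[str, Tuple[List[float], List[float]]] = {}
    for (pid, years, _), z in zip(records, z_values):
        times, outcomes = grouped.setdefault(pid, ([], []))
        times.append(years)
        outcomes.append(z)
    return {pid: ols_slope(ts, ys) for pid, (ts, ys) in grouped.items()}


rocs = individual_rocs([("a", 0.0, 10.0), ("a", 1.0, 20.0), ("a", 2.0, 40.0),
                        ("b", 0.0, 30.0), ("b", 1.0, 30.0), ("b", 2.0, 30.0)])
print(rocs)
```

With only two or three measurements per participant, separate least-squares slopes are noisy, which is one reason a mixed model that pools information across participants is preferable in practice.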
2.2.3 Temporal dynamics of pNfL and other markers (Sample D)
For the development of biomarker trajectories, we used a method based on differential equations described in previous studies 26. We made two main assumptions: 1) all individuals follow the same general pattern of biomarker trajectory during the progression of AD, and 2) the biomarker levels either increase or decrease monotonically with time. For each individual rate of change in the level of a biomarker, as estimated in Section 2.2.2, we first calculated the corresponding mean biomarker level (the mean of the biomarker values used to calculate the rate of change). A quadratic function was fitted to the data to describe the ROC as a function of the mean. The function that best described the relationship between ROC and mean values was then integrated to produce the average long-term temporal dynamics of the biomarker. For the integration, we assumed that at time zero the mean value of the biomarker is equal to its average value at the point where the CDR-SB score first exceeds zero in the data, i.e. when the first symptomatic cognitive impairment as assessed by CDR-SB is observed 27. Hence, without loss of generality, we applied a transformation of the time axis so that the biomarker change is described as a function of time from the first non-zero CDR-SB score. Using the non-linear least squares method, the individual biomarker values were then synchronised to match the population level trajectory.
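The integration step can be illustrated with a minimal sketch. It assumes the fitted quadratic has the form dy/dt = a + b*y + c*y^2, where y is the biomarker level and a, b, c are hypothetical coefficients, and it uses a fixed-step fourth-order Runge-Kutta scheme. The numerical scheme is an assumption, since the text does not state which integration method was used.

```python
from typing import Callable, List, Tuple


def quadratic_roc(a: float, b: float, c: float) -> Callable[[float], float]:
    """Rate of change as a quadratic function of the biomarker level."""
    return lambda level: a + b * level + c * level * level


def integrate_trajectory(roc: Callable[[float], float],
                         level_at_zero: float,
                         t_end: float,
                         step: float = 0.01) -> List[Tuple[float, float]]:
    """Integrate dy/dt = roc(y) forward from y(0) = level_at_zero using RK4."""
    n_steps = int(round(t_end / step))
    level = level_at_zero
    path = [(0.0, level)]
    for i in range(n_steps):
        k1 = roc(level)
        k2 = roc(level + 0.5 * step * k1)
        k3 = roc(level + 0.5 * step * k2)
        k4 = roc(level + step * k3)
        level += step * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        path.append(((i + 1) * step, level))
    return path


# Hypothetical coefficients, chosen only so the result has a known closed form:
# dy/dt = 0.5 * y with y(0) = 1 gives y(t) = exp(0.5 * t).
path = integrate_trajectory(quadratic_roc(0.0, 0.5, 0.0), 1.0, t_end=2.0)
print(path[-1])
```

Here time zero corresponds to the first non-zero CDR-SB score, so `level_at_zero` plays the role of the average biomarker value at that point. Integrating backwards in time (before symptom onset) would require a negative step, which this sketch does not cover.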
Data formatting and plotting were performed in R version 4.0.3 and RStudio version 1.3.1093. The posterior distributions of the parameters and 95% credible intervals (CrI) were estimated using Hamiltonian Monte Carlo (HMC) through the rstan interface (version 2.21.0) 28, which uses the No-U-Turn Sampler. A Gelman-Rubin statistic (Rhat) < 1.1 was used as an indicator of convergence of the parameter chains. For inference on the model parameters, we report the posterior mean estimates and their respective 95% CrIs. The temporal dynamics of biomarkers as a function of time from the first CDR-SB > 0 were estimated in MATLAB R2019a.
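For readers unfamiliar with the convergence criterion, the sketch below computes the classic Gelman-Rubin statistic for one parameter from several chains of equal length. It is illustrative only: Stan computes its diagnostic on split chains, so the values it reports can differ from this simple form, and the function name is hypothetical.

```python
import math
from statistics import mean, variance
from typing import List


def gelman_rubin_rhat(chains: List[List[float]]) -> float:
    """Classic (non-split) potential scale reduction factor for one parameter."""
    n = len(chains[0])
    if len(chains) < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need two or more chains of equal length (at least 2 draws)")
    # W: average of the within-chain sample variances.
    within = mean([variance(c) for c in chains])
    if within == 0:
        raise ValueError("within-chain variance is zero")
    # B: n times the sample variance of the chain means.
    between = n * variance([mean(c) for c in chains])
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


well_mixed = gelman_rubin_rhat([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]])
separated = gelman_rubin_rhat([[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]])
print(well_mixed < 1.1, separated < 1.1)
```

Chains that explore the same region give a value close to 1, whereas chains stuck in different regions inflate the between-chain term and push the statistic above the 1.1 threshold.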