We created two trial constructs based on common characteristics of AD randomized controlled trials, one for a symptomatic treatment and one for a disease-modification treatment. Each construct was modeled after a similar symptomatic or disease-modification treatment trial that had been conducted by the ADCS. Within both trial constructs, we assumed that in-person visits were paused for a 6-month period from the date of a strict stay-at-home order until the order was relaxed and in-person visits could resume. Within each construct, we assessed three representative hypothetical scenarios that differed in the modifications made in response to the abrupt interruption of in-person visits. Simulations of the three scenarios were performed using data from past ADCS trials to investigate the impact of trial modifications on the power to detect a specified treatment effect. The details of the trial constructs, including trial status at the time the stay-at-home order was issued, protocol modifications due to the stay-at-home order, and specific modeling assumptions and procedures, are described below and summarized in Table 1. Because assumptions regarding COVID-related trial modifications differed between the two trial constructs, different approaches and methods were used in the simulations. All simulations were conducted in R version 3.6.3.
Trial Construct 1 – Symptomatic Trial
The first hypothetical trial construct is conceptualized as a phase 2 proof-of-concept, 12-month, randomized, double-blind, placebo-controlled trial of an oral agent for mild-to-moderate AD, with change on the Alzheimer's Disease Assessment Scale-Cognitive (ADAS-cog) as the primary outcome measure. Assessments were planned at baseline, 3 months, 6 months, 9 months, and 12 months (endpoint). The trial was powered with 0.82 statistical power (α = 0.05) to detect a 2.6-point ADAS-cog difference between baseline and 12-month (endpoint) scores in a two-sided, two-sample t-test analysis, with 180 individuals required per arm (drug or placebo). We assumed that at the time of the pause the trial was fully enrolled: a total of 360 individuals had been randomized to drug or placebo and had completed a baseline evaluation; 97.5% of participants had completed 3-month follow-up; 67% had completed 6 months; 45% had completed 9 months; and 22% had completed the planned 12-month endpoint. We modeled three potential scenarios in response to the pandemic stay-at-home order.
Scenario 0 (“Trial as Planned”). We included for reference the “trial as planned” scenario, which assumed no interruption to in-person assessments, to confirm that the original power was near 80%.
Scenario 1. The trial was abruptly ended on the date of the stay-at-home order and no further outcome data were collected. In this base condition, the hypothetical trial was truncated and data available as of the date of the stay-at-home order were analyzed.
Scenario 2. Trial medications continued throughout the pause; outcome assessments were stopped for 6 months and then resumed after the pause; windows for outcome assessments were not extended, so assessments that were due during the pause were missed. Because there was no extension for visits to occur outside of the 12-month trial window, the 45% of individuals who had completed through month 9 before the pause did not have a chance for another assessment within the 12-month window; the 31.5% of individuals who had completed only through month 6 before the pause did not have a month 9 assessment but did have a month 12 (endpoint) assessment; and the 2.5% of individuals who had completed only through month 3 did not have a month 6 assessment but did have month 9 and month 12 (endpoint) assessments.
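As a concrete illustration, the Scenario 2 visit filter can be sketched as follows (a minimal Python sketch, although the original simulations were written in R; the function name and the simplifying assumption that the pause begins immediately after a participant's last completed visit are ours):

```python
# Scenario 2 filter for Trial Construct 1 (illustrative sketch).
# Assumption: the 6-month pause starts right after the participant's last
# completed visit, and assessment windows are not extended past month 12.

SCHEDULE = [3, 6, 9, 12]  # post-baseline assessment months

def remaining_visits(last_completed, pause_len=6, window=12):
    """Visits a participant can still complete under Scenario 2."""
    return [v for v in SCHEDULE
            if v > last_completed                 # not yet completed
            and v >= last_completed + pause_len   # falls due after the pause
            and v <= window]                      # no extension of the window
```

For example, a participant who had completed month 6 before the pause misses month 9 but still reaches month 12, while one who had completed month 9 has no further assessments, matching the proportions described above.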
Scenario 3. Trial medications continued throughout the pause; outcome assessments were stopped for 6 months and then resumed after the pause; the window to complete the 12-month endpoint assessment was extended by 3 months so that endpoint outcomes could be assessed in person. Extending the window for the 12-month endpoint assessment represented a protocol change that allowed patients to be on study drug for up to 15 months. As a result, the 22% of individuals who had completed the trial before the pause would have been treated for the planned 12 months; the remainder of study participants could have been treated for a longer period of up to 15 months.
Statistical Methods for Trial Construct 1. For each scenario, we performed a simulation to calculate power to detect each of three 12-month ADAS-cog change-score effect sizes for placebo vs. treatment (2.0, 2.5, and 3.0 ADAS-cog points, pooled standard deviation (SD) fixed at 6.0) using three different analysis methods (linear mixed effects with categorical time, linear mixed effects with continuous time, and Student's t-test) at three different sample sizes (the planned sample size of 360 and two others: 320 and 400). Simulations were done with Monte Carlo methods applied to a pooled dataset of ADAS-cog scores from 641 participants who completed the ADCS homocysteine and lipid-lowering trials.1, 2
As a first step, a least-squares slope statistic was computed from the ADAS-cog scores of each participant in the pooled dataset as a measure of relative disease progression over a 12-month study period. This set of slopes was ordered from smallest to largest. The following algorithm was then applied. 1) An N/2-sized subset of these slopes was selected at random without replacement using sampling weights biased toward larger slopes. These slopes represented the “placebo group.” 2) A second N/2-sized subset of slopes was selected in a similar manner but using sampling weights biased toward smaller slopes. These slopes represented the “active drug group.” The weight distributions for “placebo group” and “active drug group” sample selection are shown in Figure 1. 3) The sampling procedure was repeated to allow subsets of slopes to be drawn with means and SDs for ADAS-cog change differences (i.e., active vs. placebo effect size) centered around the targeted ADAS-cog change difference score (i.e., 2.0, 2.5, or 3.0 ADAS-cog points) and SD (i.e., 6.0), and this repetition continued until the observed mean ADAS-cog change difference score and SD were within 0.001 of the targeted effect size and SD. This method of creating a dataset was repeated until we obtained 20 datasets of size N with the specified observed between-group ADAS-cog change difference (i.e., 2.0, 2.5, or 3.0) and SD (i.e., 6.0).
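The weighted draws in steps 1 and 2 can be sketched as follows (Python for illustration, though the paper's simulations were in R; the linear rank weights are our assumption, standing in for the actual weight distributions shown in Figure 1):

```python
import random
import statistics

def weighted_sample_no_replace(values, weights, k, rng):
    """Draw k values without replacement, probability proportional to weight."""
    values, weights = list(values), list(weights)
    picked = []
    for _ in range(k):
        i = rng.choices(range(len(values)), weights=weights, k=1)[0]
        picked.append(values.pop(i))
        weights.pop(i)
    return picked

def draw_arms(slopes, n_per_arm, rng):
    """One candidate dataset: the "placebo" arm is biased toward larger
    (faster-progressing) slopes, the "active drug" arm toward smaller ones.
    Linear rank weights are an illustrative stand-in for Figure 1."""
    order = sorted(range(len(slopes)), key=lambda i: slopes[i])
    rank = {idx: r for r, idx in enumerate(order)}                # 0 = smallest
    w_up = [rank[i] + 1 for i in range(len(slopes))]              # favors large
    w_down = [len(slopes) - rank[i] for i in range(len(slopes))]  # favors small
    placebo = weighted_sample_no_replace(slopes, w_up, n_per_arm, rng)
    active = weighted_sample_no_replace(slopes, w_down, n_per_arm, rng)
    return placebo, active
```

In the paper's procedure this draw is repeated until the between-arm mean change difference and SD fall within 0.001 of the target; a practical implementation would also cap the number of attempts.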
As a second step, a filtering process was applied to each dataset such that 30% of subjects were randomly selected to be dropped from the study prior to the 12-month endpoint, with the timing of the dropouts uniformly distributed over the 12-month trial period. Following application of the dropout filter, one of three additional filters was applied to represent one of the COVID response modification scenarios (plus the “as planned” scenario) described above. The 20 datasets for each combination of sample size, effect size, and scenario were submitted to a bootstrap analysis with 1000 replicates to determine the statistical power of each of three statistical testing procedures: a) linear mixed effects with categorical time, b) linear mixed effects with continuous time, and c) Student's t-test. The bootstrap process involved resampling the dataset multiple times without replacement and calculating the three test statistics and their associated P values. The proportion of P values less than 0.05 over the 1000 replicates provided the power estimate.
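The dropout filter and resampling power calculation might look like the following (an illustrative Python sketch; the t-test shown here is one of the three tests, and the normal approximation to the t reference distribution and the function names are our assumptions):

```python
import random
import statistics
from statistics import NormalDist

def welch_p(x, y):
    """Two-sided p-value for a two-sample Welch test, using a normal
    approximation to the t reference distribution (adequate at these n)."""
    se = (statistics.variance(x) / len(x) + statistics.variance(y) / len(y)) ** 0.5
    z = (statistics.mean(x) - statistics.mean(y)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def bootstrap_power(placebo, active, n_reps=1000, keep=0.70, seed=0):
    """Share of resamples with p < 0.05. `keep` mimics the 30% dropout
    filter by analyzing 70% of each arm; resampling is without
    replacement, as described in the text."""
    rng = random.Random(seed)
    k_p, k_a = int(keep * len(placebo)), int(keep * len(active))
    hits = 0
    for _ in range(n_reps):
        p = rng.sample(placebo, k_p)
        a = rng.sample(active, k_a)
        hits += welch_p(p, a) < 0.05
    return hits / n_reps
```

The same resampling loop would be repeated with the two mixed-effects analyses in place of the t-test to fill out the power grid.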
Trial Construct 2 – Disease Modification Trial
The second hypothetical trial construct is conceptualized as a phase 2b or 3 disease-modification trial with monthly intravenous infusions of a monoclonal antibody targeted to AD pathology. It is an 18-month, randomized, double-blind, placebo-controlled trial for early-stage AD (i.e., MCI due to AD or mild AD dementia) with the ADAS-cog as the primary outcome. Evaluations were planned for baseline, 3 months, 6 months, 12 months, and 18 months (endpoint). The trial was powered with 0.80 statistical power (α = 0.05) to detect a 1.85-point ADAS-cog difference between treatments at 18 months using a two-sided, two-sample t-test analysis with 140 individuals per arm (drug or placebo). We assumed that at the time of the pause the trial was fully enrolled: a total of 280 individuals had been randomized to drug or placebo and had completed a baseline evaluation; 80% had completed 3-month follow-up; 50% had completed 6 months; 25% had completed 12 months; and 12% had completed the planned 18-month endpoint. We modeled three potential scenarios in response to the pandemic stay-at-home order.
Scenario 0 (“As Planned”). We included for reference the “as planned” scenario, which assumed no interruption to in-person assessments, to confirm that the original power was near 80%.
Scenario 1. The trial was abruptly ended on the date of the stay-at-home order and no further outcome data were collected. In this base condition, the hypothetical trial was truncated and data available as of the date of the stay-at-home order were analyzed.
Scenario 2. Medication and assessments were stopped for 6 months and then resumed after the pause. This resulted in missed medication and assessments for visits planned during the pause, creating a condition in which approximately 20% of individuals who had completed only baseline missed their month 3 and month 6 outcome assessments, approximately 30% of individuals who had completed month 3 missed their month 6 outcome assessment, approximately 25% who had completed month 6 missed month 12, and approximately 13% who had completed month 12 missed month 18. We also assumed that 24% discontinued before or on the date of the pause.
Scenario 3. Trial medication was stopped during the 6-month pause, but the cognitive outcome measure continued to be assessed remotely on the planned schedule. The impact of remote assessment was modeled by adjusting ADAS-cog scores upward (worsening) by 0.5 points. Medication was resumed after the pause. This created a condition similar to scenario 2, except that the visits missed in scenario 2 were instead completed remotely. We again assumed that 24% discontinued before or on the date of the pause.
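The difference between scenarios 2 and 3 can be sketched as follows (Python for illustration; the timing rule that the pause begins immediately after a participant's last completed visit is a simplifying assumption):

```python
# Pause-window visits for Trial Construct 2 (illustrative sketch).
SCHEDULE = [3, 6, 12, 18]   # post-baseline assessment months
REMOTE_SHIFT = 0.5          # points added (worsening) for a remote ADAS-cog

def pause_visits(last_completed, pause_len=6):
    """Visits falling due during the 6-month pause: missed entirely in
    scenario 2, assessed remotely in scenario 3."""
    return [v for v in SCHEDULE if last_completed < v <= last_completed + pause_len]

def remote_score(raw):
    """Scenario 3 adjustment for remote administration."""
    return raw + REMOTE_SHIFT
```

For example, participants with only a baseline visit miss months 3 and 6 in scenario 2 but complete them remotely, with the 0.5-point adjustment, in scenario 3, matching the proportions given above.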
Statistical Methods for Trial Construct 2. For each scenario, we performed a simulation to calculate power to detect each of four ADAS-cog effect sizes for placebo vs. treatment (1.5, 2.0, 2.5, and 3.0 ADAS-cog points, pooled SD fixed at 4.7), as well as the effect size of 1.85 points for which the trial was powered, using three different analysis methods (linear mixed effects with categorical time, linear mixed effects with continuous time, and Student's t-test) at three different sample sizes (the planned sample size of 280 and two others: 240 and 320). Simulations were based on resampling data from 769 participants with MCI from the ADCS donepezil vs. vitamin E trial,3 a trial with inclusion criteria and an assessment schedule (i.e., 3, 6, 12, and 18 months) similar to those of Trial Construct 2. Accrual and dropout patterns from this trial informed the simulation studies.
Statistical power under each of the three scenarios and the various effect and sample sizes was determined by simulation, with each simulation comprising N = 1000 runs. Using the planned sample size of 280 participants as an example, each simulation run resampled 280 participants from the donepezil vs. vitamin E trial population. We used stratified sampling with replacement to retain the dropout pattern of the donepezil vs. vitamin E trial, drawing 75% of each sample from completers (the proportion of completers in the donepezil vs. vitamin E trial) and 25% from those who dropped out before 18 months (the proportion of dropouts before 18 months in that trial). Sampled (and resampled) participants were randomly assigned 1:1 to the “active drug” or the “placebo” arm. Participants were then randomly assigned to one of four predetermined accrual/start dates in the proportions described for Trial Construct 2, so that at the time of the stay-at-home order 80% had completed 3 months, 50% had completed 6 months, 25% had completed 12 months, and 12% had completed the planned 18-month endpoint. After this assignment, 24% were randomly selected to have “dropped out” on the day of the stay-at-home order (i.e., ADAS-cog data beyond their presumed last assessment before the stay-at-home order were deleted). In this way we simulated a 24% dropout due to the pandemic, overlaid on the naturally occurring dropout in the trial.
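One simulation run's sampling scheme could be sketched like this (Python for illustration, with hypothetical records in place of the donepezil vs. vitamin E data; the exact-stage proportions are derived from the cumulative completion percentages above):

```python
import random

def simulate_cohort(completers, dropouts, n=280, seed=0):
    """Stratified resampling with replacement (75% completers / 25% dropouts),
    1:1 arm assignment, accrual stage at the stay-at-home order, and a 24%
    pandemic-dropout overlay. Illustrative sketch only."""
    rng = random.Random(seed)
    n_comp = round(0.75 * n)
    people = rng.choices(completers, k=n_comp) + rng.choices(dropouts, k=n - n_comp)
    rng.shuffle(people)
    arms = ["active"] * (n // 2) + ["placebo"] * (n - n // 2)
    rng.shuffle(arms)
    # Months completed at the order: cumulative 80/50/25/12% completion implies
    # exactly-at-stage shares of 20/30/25/13/12% for 0/3/6/12/18 months.
    stages = ([18] * round(0.12 * n) + [12] * round(0.13 * n)
              + [6] * round(0.25 * n) + [3] * round(0.30 * n))
    stages += [0] * (n - len(stages))
    rng.shuffle(stages)
    covid_drop = set(rng.sample(range(n), round(0.24 * n)))
    return [(people[i], arms[i], stages[i], i in covid_drop) for i in range(n)]
```

Data beyond a flagged participant's last pre-order assessment would then be deleted before analysis, reproducing the pandemic-dropout overlay.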
The treatment effect was constructed as follows. Using the observed placebo arm, we first established that a mean difference of 1.85 points between arms in 18-month ADAS-cog change scores (a standardized effect size of 0.395) would be detectable with 80% power at a sample size of 280 and a 24% dropout rate, using a two-sided t-test at the 5% significance level. We centered our simulated effect sizes around this value for the power studies, which were carried out for mean between-arm differences in 18-month ADAS-cog change scores of 1.5, 2.0, 2.5, and 3.0 points. For all scenarios, the effect of active treatment was modeled by adding a quadratic time trend (i.e., an increasing treatment effect) to the outcome data for the “active drug” arm in each simulation.
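Two pieces of this construction can be sketched in code (Python; the normal-approximation sample-size formula and the particular quadratic scaling are our assumptions — the paper states only that the trend is quadratic and increasing):

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(delta, sd, alpha=0.05, power=0.80, dropout=0.0):
    """Normal-approximation per-arm n for a two-sided two-sample test,
    inflated for anticipated dropout; a rough check on the stated design."""
    z = NormalDist()
    za, zb = z.inv_cdf(1 - alpha / 2), z.inv_cdf(power)
    n = 2 * ((za + zb) * sd / delta) ** 2
    return ceil(n / (1 - dropout))

def treatment_effect(month, total_effect=1.85, horizon=18):
    """Quadratic (increasing) treatment effect reaching `total_effect` at the
    18-month endpoint; subtracted from active-arm ADAS-cog change scores."""
    return total_effect * (month / horizon) ** 2
```

With delta = 1.85, sd = 4.7, and 24% dropout, `n_per_arm` lands in the neighborhood of the stated 140 per arm; the exact figure depends on the t correction and on how dropout was handled in the original calculation.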
Each simulated dataset was used to determine power to detect a significant difference at the 5% level (two-sided) between the “active drug” and “placebo” arms using three methods: a) a t-test comparing the between-arm difference at 18 months; b) a linear mixed effects model with categorical time, testing for a difference at 18 months; and c) a linear mixed effects model with continuous time, testing for a difference in slopes. Models included fixed effects of baseline ADAS-cog score, treatment group, and the baseline ADAS-cog score × treatment group interaction. Assessment visit (e.g., 3-month, 6-month, etc.) and the assessment visit × treatment group interaction were also included. The categorical time model assumed a random intercept; the continuous time model assumed an unstructured covariance matrix.