The increasing use of longitudinally collected patient-reported outcomes (PROs) to evaluate treatment risks and benefits in cancer randomized controlled trials (RCTs) has emphasized the need to evaluate appropriate approaches to support trial design and thorough statistical analysis. In clinical trials where the exposure is time-invariant, the aim of a study involving repeated measurements is typically to compare the change in response between treatment and control. Although several approaches to the analysis of longitudinal RCT data have been proposed, there is no gold standard. The Setting International Standards in Analyzing Patient-Reported Outcomes and Quality of Life (SISAQOL) Endpoints Data Consortium provides recommendations on several aspects of PRO analysis, from developing a taxonomy of research objectives to handling missing values (1). The statistical methods proposed for answering the broad selection of research questions within RCTs range from t-tests, Cox regression, analysis of variance (ANOVA), and analysis of covariance (ANCOVA) to mixed models, with further considerations of adjustments, clustering, and interactions (1). Although many approaches exist for analyzing longitudinal data in clinical trials, power and sample size methods are available for only a limited class of these models. Moreover, even though longitudinal data often provide increased statistical power for examining causal effects and treatment differences, the complexity of these designs, along with the underlying correlation structure, can impact the application of the planned statistical analysis. In addition to the challenging task of identifying the method that may ‘best’ answer specific research questions, we face the need to ensure, within complex data structures, an appropriate sample size through statistical power estimation so that the analysis results can be meaningfully evaluated.
For the last three decades, several method publications have emphasized the need for sophisticated techniques to power studies involving longitudinal data, underlining the possible effects of within-subject correlation, repeated measurements, and missing data on effect-size estimation. Rochon (2) adapted Liang and Zeger’s approach to sample size calculation for repeated-measures experiments, using the generalized estimating equations (GEE) model with a model-based covariance, while Muller focused on general linear multivariate models (3). In their extensive work on sample size estimation for longitudinal designs with attrition, Hedeker, Gibbons, and Waternaux (4) derived several formulas under assumptions of compound symmetry, first-order autoregressive, and non-stationary random-effects structures, applicable to a wide variety of models. Tu developed an alternative to GEE and mixed-effects models that derives the power function from the asymptotic distribution of the model estimate. This method attempts to address the limitations of clustered approaches; however, it requires the strong assumption of a common, constant cluster size across clusters (5). Many authors recommended sensitivity analyses to assess sample size requirements through variation of the non-centrality parameters (3, 4), and others recommended inflating the resulting sample size to protect against deviations from the assumptions (2). More recent work has implemented sample size formulations for comparing differences among groups over time using GEE methods that account for missingness patterns, correlation structures, and unbalanced designs (6, 7), and has extended the approach to multivariate analysis (8). Although these techniques aid power calculations for the planned analyses, the validity of their results rests on strong assumptions about the parameter estimates and the covariance structure.
Often, the failure to meet these assumptions results in non-estimable model parameters, forcing the use of alternative modeling techniques. Furthermore, all of the proposed approaches seem to require a rather intensive effort in terms of evaluation and coding, while model convergence may still not be achieved. Since consensus on the best approach remains open (1), a question arises: could we reduce the power calculation process to a less computationally intensive approach without compromising statistical integrity? In this paper we provide a thorough analysis of six common approaches to longitudinal data with binomial outcomes to verify whether a more straightforward and direct approach to power analysis is plausible. We use simulated data to illustrate the differences, similarities, and feasibility of the following six techniques under selected parameter combinations: 1) generalized linear model (GLM) with generalized estimating equations (GEE); 2) generalized linear mixed model (GLMM); 3) logistic regression; 4) Cochran-Mantel-Haenszel (CMH) test; 5) Chi-square test; and 6) Fisher’s exact test.
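As a preview of the simulation-based comparison, the sketch below estimates empirical power for the two simplest techniques on the list, the Chi-square and Fisher’s exact tests, applied to the final visit of a correlated longitudinal binary outcome. The logistic-normal data-generating model, effect size, and sample sizes are illustrative assumptions, not the simulation settings used in this paper.

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

rng = np.random.default_rng(2024)

def simulate_trial(n_per_arm=60, beta_trt=0.8, sd_subj=1.0):
    """Simulate one two-arm trial under a logistic-normal model: a subject-level
    random intercept (sd_subj) induces within-subject correlation across visits;
    only the final-visit response is retained, since the Chi-square and Fisher
    tests use a single 2x2 table. Parameter values are illustrative."""
    counts = []
    for trt in (0, 1):
        b = rng.normal(0.0, sd_subj, n_per_arm)   # subject random intercepts
        logit = -0.5 + beta_trt * trt + b         # final-visit linear predictor
        p = 1.0 / (1.0 + np.exp(-logit))
        counts.append(int(rng.binomial(1, p).sum()))
    return counts  # [control responders, treated responders]

def empirical_power(p_value, n_sim=500, alpha=0.05, n_per_arm=60, **kw):
    """Fraction of simulated trials in which the given test rejects at level alpha."""
    hits = 0
    for _ in range(n_sim):
        c, t = simulate_trial(n_per_arm=n_per_arm, **kw)
        table = [[c, n_per_arm - c], [t, n_per_arm - t]]
        hits += p_value(table) < alpha
    return hits / n_sim

chisq_p = lambda tab: chi2_contingency(tab)[1]    # Chi-square test p-value
fisher_p = lambda tab: fisher_exact(tab)[1]       # Fisher's exact test p-value
```

Contrasting `empirical_power(chisq_p)` with `empirical_power(chisq_p, beta_trt=0.0)` compares power under the alternative against the empirical type I error rate; the model-based approaches on the list (GEE, GLMM, logistic regression, CMH) would slot into the same loop, at the cost of fitting a model for each simulated trial.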