We examined the impact of five alternative approaches to longitudinal missing data in an empirical example of RA treatment strategy comparison, in which we used data from a prospective observational study to form an external control group for a clinical trial. We consistently found superior outcomes of the "stringent treat-to-target" strategy (based on the trial data) compared with the "pragmatic treat-to-target" strategy (based on the observational study data), although the difference was statistically significant only at the 12-month visit. The complete follow-up case analysis tended to give higher OR estimates and wider confidence intervals than the other, more sophisticated approaches; however, this difference decreased with longer follow-up. The four methods employing IPCW, MI, and their combinations generally yielded similar OR estimates despite their differing modeling assumptions.
Although the literature on missing data is vast [26], our paper is unique in its focus on longitudinal missing data challenges in the emerging area of using real-world data as external controls for trial data [4]. The similarity of the patient populations and applied treatment strategies in the two studies providing data for this methodological investigation enabled us to assess missing data patterns resulting from follow-up under different study designs. We found a larger amount of missing data with more complex missing patterns in the observational study compared with the limited and monotone missing data in the trial. As a result, the differences across the alternative approaches to longitudinal missing data mainly came from the estimated disease remission proportions in the "pragmatic treat-to-target" strategy arm (based on the observational study data). Most notably, the complete follow-up case analyses gave smaller estimates for the proportions of patients reaching the desired remission outcome at 6 and 12 months in the "pragmatic treat-to-target" strategy arm. This made the estimated benefits associated with the "stringent treat-to-target" strategy (based on the trial data) appear better.
Both IPCW and MI can provide unbiased estimates under the missing at random assumption, which is weaker than the assumption for the complete follow-up case analysis. An advantage of MI is that this method efficiently uses information from individuals with partially missing data [21, 22, 24]. All available and relevant data can be included in the imputation model, including both variables related to the outcome analyses and variables associated with missingness [22, 24]. However, the MI approach is potentially sensitive to misspecification in situations where some individuals have large blocks of missing values [21]. Thus, missing data due to drop-out in the present study may make MI less appealing, especially for the 24-month time point. IPCW assumes a correctly specified model for the missingness mechanism, given observed data at previous time points [21, 27]. A correctly specified IPCW model can account for missingness due to blocks of drop-out. However, IPCW can be less efficient due to the loss of information from incomplete cases [21]. Thus, the IPCW model for the trial data, with smaller amounts of missing data at 6 and 12 months and a maximum of 10.1% missing at 24 months, was likely to be more efficient than the IPCW model for the observational data, with a substantial amount of missing outcome data during follow-up.
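The weighting idea behind IPCW can be illustrated with a minimal sketch. It is not the weighting model used in this study: the records are invented, the weights come from a single hypothetical binary stratum rather than from a model conditioning on observed data at previous time points, and the names `records`, `complete_case_proportion`, and `ipcw_proportion` are illustrative only. Each observed patient is weighted by the inverse of the estimated probability of being observed in his or her stratum, so that observed patients also stand in for similar patients who are missing.

```python
from collections import defaultdict

# Hypothetical records: (stratum, remission). remission is 1/0 when the visit
# was observed and None when it is missing. All numbers are made up.
records = (
    [("low", 1)] * 3 + [("low", 0)] * 1 + [("low", None)] * 6
    + [("high", 1)] * 2 + [("high", 0)] * 6 + [("high", None)] * 2
)


def complete_case_proportion(records):
    """Remission proportion among patients with an observed outcome only."""
    observed = [y for _, y in records if y is not None]
    return sum(observed) / len(observed)


def ipcw_proportion(records):
    """Remission proportion weighted by 1 / P(observed | stratum)."""
    n_total = defaultdict(int)
    n_observed = defaultdict(int)
    for stratum, y in records:
        n_total[stratum] += 1
        if y is not None:
            n_observed[stratum] += 1

    # Inverse of the observed fraction within each stratum.
    weights = {s: n_total[s] / n_observed[s] for s in n_observed}

    numerator = sum(weights[s] * y for s, y in records if y is not None)
    denominator = sum(weights[s] for s, y in records if y is not None)
    return numerator / denominator


cc_estimate = complete_case_proportion(records)
ipcw_estimate = ipcw_proportion(records)
```

In this toy example the stratum with the higher remission proportion also has more missing visits, so the complete-case estimate (5/12) is lower than the weighted estimate (0.5). The weighted estimate relies on only 12 of the 20 hypothetical patients, which illustrates why IPCW can be less efficient when many cases are incomplete.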
In the present empirical evaluation, censoring all patients at first missing data (strict censoring) created a monotone missing pattern in the observational data, while a monotone missing pattern already existed naturally in the trial data. Despite using IPCW to account for created or naturally occurring drop-out, the estimates from the strict censoring approach were less efficient at 12 and 24 months than approaches involving MI, reflecting the substantial loss of information due to excluded data points. This may indicate increased efficiency due to recovered information when using MI to impute fully or partially missing visit data, and suggests that such imputation may be preferable to excluding individuals at their first missing value.
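The effect of strict censoring on the missing data pattern can be sketched as follows. This is an illustration under simplifying assumptions, not the study's implementation: each patient is reduced to a list of remission outcomes at the 6-, 12-, and 24-month visits, `None` marks a missing visit, and the function names and the example patient are hypothetical.

```python
from typing import List, Optional

Visit = Optional[int]  # 1 = remission, 0 = no remission, None = missing


def is_monotone(visits: List[Visit]) -> bool:
    """True if no observed visit follows a missing one (drop-out only)."""
    seen_missing = False
    for value in visits:
        if value is None:
            seen_missing = True
        elif seen_missing:
            return False
    return True


def strict_censor(visits: List[Visit]) -> List[Visit]:
    """Discard every visit from the first missing one onward."""
    censored = False
    result: List[Visit] = []
    for value in visits:
        if value is None:
            censored = True
        result.append(None if censored else value)
    return result


# Hypothetical patient who misses the 6-month visit but attends later visits.
intermittent = [None, 1, 1]
after_censoring = strict_censor(intermittent)

# Number of observed visits discarded to obtain a monotone pattern.
lost_visits = sum(v is not None for v in intermittent) - sum(
    v is not None for v in after_censoring
)
```

For this hypothetical patient, strict censoring turns an intermittent pattern into a monotone one at the cost of two observed visits. This is the kind of discarded information that approaches involving MI can retain.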
A limitation of this methodological investigation is the generalizability of results to other settings using an external control group. Data in the external control group of the present study were provided by a contemporaneous, prospective observational study with a patient population and follow-up strategy similar to the trial [12–14]. This is the most favorable type of external control group [4]. As a result, emulating a target trial was relatively straightforward. Other sources of observational data, such as electronic health records and insurance claims, likely pose more methodological challenges. Furthermore, as we used empirical data rather than simulations, we do not know the true underlying effect of the "stringent treat-to-target" strategy compared with the "pragmatic treat-to-target" strategy.
In conclusion, we empirically examined the impact of different approaches for missing follow-up data when using data from an observational study to form an external control arm for a clinical trial. Despite the favorable setting of having prospectively collected observational data, there were some differences in the effect estimates, although the clinical conclusion was not affected. The differences mainly came from the handling of more extensive and complex missing data in the observational part of the study. When using routine observational data as external controls, even more complex missingness issues are likely. As the quality of a comparative effectiveness study is dependent on what we compare to, we cannot overemphasize the importance of carefully examining missing data patterns and conducting appropriate sensitivity analyses in this setting.