The study was performed in accordance with the PRISMA guidelines for meta-epidemiological studies.22
We studied transcatheter and surgical aortic valve replacement for aortic stenosis because both high-quality RCTs and a large number of non-randomized studies were available. Transcatheter aortic valve implantation is a relatively new technique, and its safety and efficacy are of current clinical interest.
We included all RCTs that randomly assigned patients to transcatheter or surgical aortic valve replacement and followed patients over time. We also included all comparative cohort studies that reported primary data on outcomes of interest after transcatheter or surgical aortic valve replacement.
We excluded non-randomized studies that were not comparative cohort studies, that defined the population by excluding the outcome of interest, or that combined patients from RCTs and non-randomized studies. We also excluded conference abstracts, poster presentations, non-peer reviewed publications, unpublished literature, systematic reviews that lacked primary data, and studies that used other surgical aortic valve replacement methods (e.g., minimally invasive, sutureless).
When multiple publications used the same cohort, we included the publication with the most representative sample, determined by sample size or duration of follow-up.
We searched Medline, Medline In-Process/ePubs, Embase, Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews, Scopus, and Web of Science from inception to June 2017 (eTable 1). We used DistillerSR (Evidence Partners, Ottawa, Canada) to check for duplicate citations, and to screen titles, abstracts, and full text.
A single reviewer collected study characteristics, patient characteristics, and outcomes of interest; questions were resolved by consensus among the study team. A second reviewer re-abstracted outcomes for a sample of 15 nonrandomized studies (17%); agreement with the original abstraction demonstrated excellent inter-rater reliability (intraclass correlation coefficient [ICC], 0.99 [95% CI, 0.98 to 0.99]).23
We collected study sample size, publication year and country, surgical approach, and the study time period. We collected surgical risk scores (e.g., EuroSCORE II) as a measure of potential selection bias among comparison groups.
We defined postoperative mortality as death from any cause within 1 month or in hospital after the procedure, regardless of location. We defined length of stay as the number of days the patient stayed in the hospital after the procedure. For each outcome, we extracted the components needed to calculate pooled estimates of treatment effects. Where possible, we calculated missing data points from the information reported.
Explanatory variables: Study designs
We categorized studies into 8 groups according to study design: (1) All (all RCT and nonrandomized studies), (2) All RCT, (3) High quality RCT, (4) Low quality RCT, (5) All non-randomized studies, (6) Nonrandomized studies without adjustment, (7) Nonrandomized studies adjusted using propensity score matching (PSM), and (8) Nonrandomized studies adjusted using regression.
RCTs were classified as high or low quality using the Cochrane Risk of Bias (ROB) tool,24 applied to the content of the published articles; authors were not contacted for additional information (eTable 3). No RCT blinded study participants; hence RCTs that satisfied all other criteria were categorized as high quality. Non-randomized studies reported unadjusted estimates, adjusted estimates, or both. Estimates from non-randomized studies were pooled into 3 groups: without adjustment, adjusted using PSM, and adjusted using regression.
Finally, we previously developed a set of 41 attributes of non-randomized studies that could bias their results (Appendix B). These attributes were based on existing frameworks of bias and quality assessment tools for nonrandomized studies, and were extensively pilot tested and iteratively developed for clarity and reliability.
We compared overall study characteristics between RCTs and non-randomized studies using descriptive statistics. To combine continuous variables across studies, we calculated the weighted mean of the study estimates. The pooled standard deviation (SD) was calculated directly where SDs were reported; where an SD was missing, it was imputed from the pooled variance of the included studies in the relevant group.25
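As an illustration of this pooling step, the following minimal Python sketch combines study-level means and SDs. It assumes sample-size weights for the weighted mean and a degrees-of-freedom-weighted pooled variance; the text does not state the weights used, so both choices are assumptions, and the function name `pool_continuous` and the example values are hypothetical.

```python
import math

def pool_continuous(studies):
    """Combine (n, mean, sd) tuples; sd is None when a study did not report it.

    Assumption: means are weighted by sample size and the pooled variance
    weights each reported variance by its degrees of freedom (n - 1).
    """
    total_n = sum(n for n, _, _ in studies)
    weighted_mean = sum(n * mean for n, mean, _ in studies) / total_n

    # Pooled variance from the studies that reported an SD.
    reported = [(n, sd) for n, _, sd in studies if sd is not None]
    df = sum(n - 1 for n, _ in reported)
    pooled_sd = math.sqrt(sum((n - 1) * sd ** 2 for n, sd in reported) / df)

    # Studies with a missing SD receive the pooled SD as an imputed value.
    completed = [(n, mean, sd if sd is not None else pooled_sd)
                 for n, mean, sd in studies]
    return weighted_mean, pooled_sd, completed

# Hypothetical example: three studies, the third with no reported SD.
mean, sd, completed = pool_continuous(
    [(10, 80.0, 5.0), (30, 84.0, 7.0), (20, 82.0, None)]
)
```

Imputing the pooled SD for a study leaves the group's pooled variance unchanged, which is why the imputation can be done after the pooled SD is computed.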
Pooled estimates of treatment effects
The effect of treatment on postoperative mortality was estimated using the odds ratio (OR); an OR < 1 indicated lower odds of death with transcatheter aortic valve implantation. For Bayesian RCTs, we assumed the median estimate represented the percentage with events.26,27 The effect of treatment on length of stay was estimated using the mean difference (MD); an MD < 0 indicated a shorter length of stay with transcatheter aortic valve implantation.
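For readers less familiar with these effect measures, the sketch below computes a study-level OR and MD with 95% CIs from summary data. It uses the standard large-sample formulas (Woolf standard error for the log OR, unpooled variances for the MD) and is not taken from the study's own code; the function names and example numbers are hypothetical, and no continuity correction is applied, so zero cells are not handled.

```python
import math

Z95 = 1.96  # normal quantile for a 95% confidence interval

def odds_ratio(events_t, n_t, events_c, n_c):
    """OR (treatment vs control) with a 95% CI from event counts."""
    a, b = events_t, n_t - events_t   # treatment: events, non-events
    c, d = events_c, n_c - events_c   # control: events, non-events
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Woolf standard error
    return (math.exp(log_or),
            math.exp(log_or - Z95 * se),
            math.exp(log_or + Z95 * se))

def mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """MD (treatment minus control) with a 95% CI from group summaries."""
    md = mean_t - mean_c
    se = math.sqrt(sd_t ** 2 / n_t + sd_c ** 2 / n_c)
    return md, md - Z95 * se, md + Z95 * se

# Hypothetical study: 5/100 deaths with transcatheter, 10/100 with surgery.
or_est, or_low, or_high = odds_ratio(5, 100, 10, 100)
# Hypothetical length of stay: 6 (SD 2) vs 9 (SD 3) days, 50 patients per arm.
md_est, md_low, md_high = mean_difference(6.0, 2.0, 50, 9.0, 3.0, 50)
```

With transcatheter implantation as the treatment arm, an OR below 1 and an MD below 0 both favor the transcatheter approach, matching the sign conventions stated above.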
All effect sizes were pooled using a random-effects model to account for potential between-study heterogeneity. For postoperative mortality, we used the DerSimonian-Laird method,28 except for estimates that incorporated adjusted ORs from nonrandomized studies adjusted using regression, which were calculated using the generic inverse variance method.25 For length of stay, we used the inverse variance method.25 All pooled estimates were presented visually using forest plots with point estimates and 95% CIs. Estimates from high-quality RCTs were considered to represent the “gold standard” treatment effects.
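The random-effects pooling described here can be sketched as follows. This is a minimal illustration of the DerSimonian-Laird moment estimator applied to generic effect estimates (for example, log ORs or MDs) and their variances; it is not the study's code, and the function name `dersimonian_laird` and the input values are hypothetical.

```python
import math

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate using the DerSimonian-Laird tau^2.

    effects: study-level estimates on an additive scale (e.g., log OR, MD).
    variances: their within-study variances (squared standard errors).
    Returns (pooled estimate, standard error, tau^2).
    """
    w = [1.0 / v for v in variances]              # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the moment estimate of between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Re-weight with tau^2 added to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Hypothetical log ORs and variances from three studies.
pooled, se, tau2 = dersimonian_laird([-0.7, -0.2, 0.1], [0.10, 0.05, 0.20])
ci = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
```

Because the function takes only an estimate and a variance per study, the same routine accepts adjusted log ORs with their standard errors, which is the input form used by a generic inverse variance analysis.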
We evaluated the impact of the 41 nonrandomized study attributes on estimates of treatment effect by calculating the ratio of odds ratios (ROR) for postoperative mortality and the difference of mean differences (DMD) for length of stay, each with a 95% CI, using random-effects meta-regression. The ROR is the ratio of the OR in one group of studies to the OR in another group of studies18; the DMD is the difference between the MD in one group of studies and the MD in another group of studies.29 We compared the pooled estimates between study categories, and also between nonrandomized studies with and without attributes hypothesized to be associated with bias. ROR < 1 and DMD < 0 indicated that studies with ‘better’ study characteristics favored transcatheter aortic valve implantation.
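The definitions of the ROR and DMD can be made concrete with a short sketch. Note that this is a simplification: it contrasts two independent pooled estimates directly using a normal approximation, whereas the analysis described above used random-effects meta-regression. The function names and input values are hypothetical.

```python
import math

def _contrast(est_a, se_a, est_b, se_b, z=1.96):
    """Difference of two independent estimates with CI and two-sided P value."""
    diff = est_a - est_b
    se = math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(diff / se) / math.sqrt(2))
    return diff, diff - z * se, diff + z * se, p

def ratio_of_odds_ratios(log_or_a, se_a, log_or_b, se_b):
    """ROR = OR_a / OR_b, computed on the log scale and back-transformed."""
    diff, low, high, p = _contrast(log_or_a, se_a, log_or_b, se_b)
    return math.exp(diff), math.exp(low), math.exp(high), p

def difference_of_mean_differences(md_a, se_a, md_b, se_b):
    """DMD = MD_a - MD_b on the original (days) scale."""
    return _contrast(md_a, se_a, md_b, se_b)

# Hypothetical pooled results: group a (OR 0.5) vs group b (OR 1.0).
ror, ror_low, ror_high, ror_p = ratio_of_odds_ratios(
    math.log(0.5), 0.3, math.log(1.0), 0.4
)
# Hypothetical pooled length-of-stay results: MD -3.0 vs MD -1.0 days.
dmd, dmd_low, dmd_high, dmd_p = difference_of_mean_differences(-3.0, 0.3, -1.0, 0.4)
```

In this hypothetical example, the ROR below 1 and the DMD below 0 would both indicate that group a yields estimates more favorable to transcatheter implantation than group b.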
All statistical analyses were conducted using RStudio version 1.0.136 (2016).30 Because the analysis of whether attributes of nonrandomized studies were associated with differences in pooled effect sizes was exploratory, a less restrictive 2-sided P value threshold of 0.10 was used to identify potentially important attributes. In all other analyses, a P value of 0.05 or less was considered statistically significant. P values for comparisons of estimates between types of study were those of the ROR or DMD for the comparison.