We first introduce the Trikalinos dataset used throughout the article. We then describe our methods for identifying characteristics associated with BoS and for developing a multivariable prediction model for BoS.
Data analysis using 43 reviews of binary outcomes identified by Trikalinos
There were 45 reviews included in the empirical review by Trikalinos et al. [9, 10]. Each review contained at least seven studies that reported both outcomes, or at least half of the studies reported both outcomes if the total number of studies was greater than 14. Each of the studies satisfying the previous requirement (i.e. at least seven studies with both outcomes) had at least 10 patients and at least two events. Two reviews contained three outcomes, but these were excluded because we chose to focus on bivariate models for two outcomes. The remaining 43 reviews were included; these contained up to 132 studies, each with two binary outcomes of interest and, for each outcome, a cross-classification (two by two) table summarising the number of outcome events and non-events in the treatment and control groups. The relationship between the pair of binary outcomes was either mutually exclusive or is-subset-of. An is-subset-of relationship is one in which one outcome is contained within the other; for example, the number of patients with a particular condition who have survived at, say, 6 months and at one year. A mutually exclusive relationship is one in which the two outcomes cannot both occur in the same patient; an example is death from breast cancer and death from other causes, excluding breast cancer.
For each of the 43 meta-analysis datasets, we used the two by two tables for each outcome in each trial to derive treatment effect estimates (log odds ratio estimates) and corresponding error variances. A fixed 0.5 continuity correction was required if any denominator in the equation for the variance was equal to zero [16, 17]; that is, if a study had a zero cell in the two by two table, then 0.5 was added to all cells for that study. This is similar to the approach used for the normal approximation analyses in the original Trikalinos review [9, 10]. For a pair of treatment effect estimates in the same trial, we also derived their within-study correlation, using the formula provided in [7, 18] for an is-subset-of relationship and the formula provided by Trikalinos & Olkin for a mutually exclusive relationship. In some studies, the within-study correlation was +1 or -1, which can lead to singular variance matrices during multivariate model estimation. To avoid this issue, we replaced any ±1 values with ±0.99, although other approaches are possible.
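As an illustration of the calculation described above, the following Python sketch derives a log odds ratio and its variance from a single two by two table, adding 0.5 to all cells when any cell is zero, and replaces within-study correlations of ±1 with ±0.99. It is not the code used for the analyses; the function and argument names are ours, and the within-study correlation formulas themselves are not reproduced here.

```python
import math


def log_odds_ratio(events_trt, n_trt, events_ctl, n_ctl, correction=0.5):
    """Return (log odds ratio, variance) from one two by two table.

    If any cell is zero, `correction` is added to all four cells,
    mirroring the fixed continuity correction described in the text.
    """
    a = events_trt              # events, treatment group
    b = n_trt - events_trt      # non-events, treatment group
    c = events_ctl              # events, control group
    d = n_ctl - events_ctl      # non-events, control group
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + correction for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance


def cap_correlation(rho, limit=0.99):
    """Replace a within-study correlation of +1 or -1 with +/-limit."""
    if abs(rho) >= 1.0:
        return math.copysign(limit, rho)
    return rho


# Hypothetical study: 10/100 events on treatment, 20/100 on control.
estimate, variance = log_odds_ratio(10, 100, 20, 100)
```

The same function would be applied once per outcome in each trial, giving the pair of estimates and variances that enter the bivariate model.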
For each of the 43 meta-analysis datasets, a univariate common-effect meta-analysis was applied to each outcome separately, using maximum likelihood (ML) estimation. We then fitted a bivariate common-effect meta-analysis, also using ML estimation, to jointly analyse both outcomes whilst accounting for any within-study correlations. The ordering of the outcomes as 1 or 2 was irrelevant (i.e. the same results were obtained regardless), though for the is-subset-of reviews outcome 2 was designated to be the subset of outcome 1.
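To make the two models concrete, the sketch below pools study estimates under a common-effect assumption. With within-study variances and correlations treated as known, the ML estimate coincides with the inverse-variance (generalised least squares) estimate, which is what the code computes. This is a minimal illustration using NumPy rather than the software used for the analyses; the function names, the use of NaN to mark a missing outcome, and the example values are our assumptions.

```python
import numpy as np


def univariate_common_effect(y, v):
    """Inverse-variance pooled estimate and its variance for one outcome."""
    y, v = np.asarray(y, float), np.asarray(v, float)
    keep = ~np.isnan(y)                 # drop studies not reporting the outcome
    w = 1.0 / v[keep]
    return np.sum(w * y[keep]) / np.sum(w), 1.0 / np.sum(w)


def bivariate_common_effect(y1, v1, y2, v2, rho):
    """Pooled estimates for two outcomes and their 2x2 covariance matrix.

    Studies missing one outcome (NaN) contribute only to the other outcome.
    """
    info = np.zeros((2, 2))    # sum of inverse within-study covariance matrices
    score = np.zeros(2)        # sum of precision-weighted study estimates
    for a, va, b, vb, r in zip(y1, v1, y2, v2, rho):
        obs = [k for k, val in enumerate((a, b)) if not np.isnan(val)]
        y = np.array([a, b])[obs]
        cov = r * np.sqrt(va * vb) if len(obs) == 2 else 0.0
        S = np.array([[va, cov], [cov, vb]])[np.ix_(obs, obs)]
        S_inv = np.linalg.inv(S)
        info[np.ix_(obs, obs)] += S_inv
        score[obs] += S_inv @ y
    cov_pooled = np.linalg.inv(info)
    return cov_pooled @ score, cov_pooled


# Hypothetical data: the second study does not report outcome 1.
y1 = [0.2, np.nan, 0.5, 0.1]
v1 = [0.04, np.nan, 0.09, 0.05]
y2 = [0.3, 0.4, 0.6, 0.0]
v2 = [0.05, 0.02, 0.10, 0.06]
rho = [0.7, np.nan, 0.7, 0.7]

est_uv, var_uv = univariate_common_effect(y1, v1)
est_mv, cov_mv = bivariate_common_effect(y1, v1, y2, v2, rho)
```

In this setting the bivariate variance for outcome 1, `cov_mv[0, 0]`, cannot exceed the univariate variance `var_uv`, and the two coincide when all within-study correlations are zero.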
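Following the bivariate analysis, the BoS was quantified for each outcome by calculating the BoS statistic proposed by Jackson et al.:
$$\mathrm{BoS} = 100\% \times \left(1 - \frac{\operatorname{var}(\hat{\theta}_{\mathrm{MV}})}{\operatorname{var}(\hat{\theta}_{\mathrm{UV}})}\right)$$
where $\operatorname{var}(\hat{\theta}_{\mathrm{MV}})$ and $\operatorname{var}(\hat{\theta}_{\mathrm{UV}})$ denote the variances of the summary estimate for that outcome from the bivariate (multivariate) and univariate meta-analyses, respectively.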
The BoS statistic provides the percentage reduction in the variance of a particular summary result that is due to (borrowed from) data from other correlated outcomes. It is the percentage weight toward the summary result for, say, outcome 1 that is given to the study data for other correlated outcomes. For example, in a bivariate meta-analysis, a BoS of 0% for outcome 1 indicates that the summary result for outcome 1 is based only on data for outcome 1, whereas a BoS of 100% indicates that it is based entirely on the correlated data from outcome 2. The distribution of BoS statistic values was summarised using descriptive statistics and graphically via histograms.
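As a small numerical illustration of this definition, the sketch below computes BoS as the percentage reduction in the variance of a summary estimate when moving from the univariate to the bivariate analysis. The function name and the variances are hypothetical and chosen only to show the arithmetic.

```python
def borrowing_of_strength(var_univariate, var_multivariate):
    """Percentage reduction in the variance of a summary estimate."""
    return 100.0 * (1.0 - var_multivariate / var_univariate)


# Hypothetical variances of the summary log odds ratio for outcome 1.
# A drop from 0.040 (univariate) to 0.030 (bivariate) is a reduction of
# about one quarter, i.e. a BoS of roughly 25%.
bos_outcome_1 = borrowing_of_strength(0.040, 0.030)

# Identical variances mean nothing was borrowed from the other outcome.
bos_none = borrowing_of_strength(0.040, 0.040)
```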
The process was then repeated using univariate and bivariate random-effects models, which allow for between-study heterogeneity. Similar conclusions were drawn, and so we focus on the results from the common-effect meta-analyses in this paper. Further, some of the bivariate random-effects models had problems estimating the between-study correlation (which often ‘converged’ at -1 or +1, for reasons explained elsewhere), and so we deemed it more reliable to focus on the BoS observed for the bivariate common-effect model.
Examining characteristics associated with BoS
The following seven meta-analysis level characteristics were selected for examination of their association with BoS statistic values from a bivariate common-effect meta-analysis:
- the percentage of studies with missing data for the outcome of interest
- the percentage of studies with missing data across both outcomes
- the number of studies in the meta-analysis
- the number of studies with only the outcome of interest
- the number of studies with both outcomes
- the average absolute within-study correlation
- the largest absolute within-study correlation
These characteristics were identified by the research team based on analytic reasoning (see Supplementary material) and our previous (applied and methodological) experience [3, 12, 15, 21]. The unadjusted effect of each characteristic on the magnitude of BoS was estimated by fitting a linear regression with BoS as the outcome and the characteristic as the only covariate. Two BoS values were available for each of the 43 reviews (one for each outcome), and so the dataset had 86 outcome values in total. A random intercept was used to account for clustering of BoS values from the same review. We also considered modelling BoS on the log scale, but this did not materially change the findings, and therefore we present results on the BoS scale to aid interpretation.
Development and internal validation of a prediction model for BoS
A multivariable prediction model was developed for predicting BoS in a new bivariate meta-analysis dataset. The seven characteristics previously listed were candidate predictors for inclusion. As there were 86 BoS values for the modelling, this corresponded to 12.3 values per candidate predictor. At the time of model development, this was considered appropriate as it was larger than ten subjects per predictor (often used as a rule of thumb for sample size), larger than a recent proposal of two values per predictor, and ensured a multiplicative margin of error of less than 20% for the residual standard deviation (i.e. lower and upper bounds of the 95% confidence interval for the residual variance within 20% of the estimated value) [23, 24].
A multivariable linear regression model containing all seven candidate predictors (forcing them all to be included, regardless of statistical significance) was fitted. The apparent model performance was quantified by the apparent R² statistic. Internal validation was then undertaken to obtain optimism-adjusted estimates of R² and the calibration slope, using bootstrap resampling with 1000 bootstrap samples, as described elsewhere [25-27]. The optimism-adjusted calibration slope was then used as a uniform shrinkage factor; that is, we multiplied the predictor effects of the fitted model by the optimism-adjusted calibration slope. Then, holding the revised predictor effects fixed, we re-estimated the model intercept to ensure calibration-in-the-large. This produced our final model with all predictors.
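The steps above can be sketched in Python as follows. This is a simplified illustration rather than the code used for the analyses: it uses ordinary least squares and resamples individual rows, so it ignores the clustering of the two BoS values within each review; the data are simulated; and all function names are ours.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit; returns [intercept, predictor effects...]."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def predict(beta, X):
    return beta[0] + X @ beta[1:]


def r_squared(y, pred):
    return 1.0 - np.sum((y - pred) ** 2) / np.sum((y - np.mean(y)) ** 2)


def calibration_slope(y, pred):
    """Slope from regressing observed values on predicted values."""
    return fit_ols(pred.reshape(-1, 1), y)[1]


def optimism_adjusted_model(X, y, n_boot=1000, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    beta = fit_ols(X, y)
    apparent_r2 = r_squared(y, predict(beta, X))
    r2_optimism, test_slopes = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)        # resample rows with replacement
        beta_b = fit_ols(X[idx], y[idx])        # refit in the bootstrap sample
        r2_boot = r_squared(y[idx], predict(beta_b, X[idx]))
        pred_orig = predict(beta_b, X)          # apply bootstrap model to original data
        r2_optimism.append(r2_boot - r_squared(y, pred_orig))
        test_slopes.append(calibration_slope(y, pred_orig))
    # The apparent calibration slope is 1, so the optimism-adjusted slope
    # equals the mean slope of the bootstrap models in the original data.
    shrinkage = float(np.mean(test_slopes))
    final = beta.copy()
    final[1:] *= shrinkage                      # uniform shrinkage of predictor effects
    final[0] = np.mean(y - X @ final[1:])       # re-estimate intercept, effects held fixed
    return {
        "apparent_r2": apparent_r2,
        "adjusted_r2": apparent_r2 - float(np.mean(r2_optimism)),
        "shrinkage": shrinkage,
        "coefficients": final,
    }


# Simulated stand-in for 86 BoS values and seven candidate predictors.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(86, 7))
true_effects = np.array([3.0, -2.0, 1.0, 0.0, 0.0, 2.0, 0.0])
y_demo = 20 + X_demo @ true_effects + rng.normal(scale=3.0, size=86)
result = optimism_adjusted_model(X_demo, y_demo, n_boot=1000, seed=1)
```

Re-estimating the intercept after shrinkage makes the mean prediction equal the mean observed value in the development data, which is the sense in which calibration-in-the-large is preserved.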
In addition to fitting the full model, a backwards selection procedure was undertaken to identify a simpler model, with p-values less than 0.1 used to define predictor inclusion. Internal validation and optimism-adjustment were again applied using bootstrapping, which accounted for the variable selection when estimating optimism.
Applications in new data
For illustration of their potential use, we applied the developed tools to predict BoS in two Cochrane reviews not included in the Trikalinos review, and to three non-Cochrane reviews, with comparison to subsequent multivariate meta-analysis results and observed BoS values.