Study Design: A systematic review was conducted of published meta-analyses that met our inclusion criteria (see below), with all database searches updated and finalized by October 31st, 2019. The procedures and outcomes are reported according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement (13) (Additional File 1).
Data Sources: A comprehensive literature search was conducted across the following databases: PubMed/Medline, Embase/Elsevier, EBSCOhost, and Web of Science. A combination of MeSH (Medical Subject Headings), EMTREE, and free-text terms, with Boolean operators and variants of terms as appropriate to each database, was used to identify eligible publications (Additional File 2). Each search required “systematic” or “meta-analysis” in the title or abstract, along with one or more of the following terms for the sample’s age (child, preschool, school, student, youth, and adolescent) and one of the following terms identifying a topic area related to childhood obesity (obesity, overweight, physical activity, diet, nutrition, sedentary, screen, fitness, or sports). Two authors (MB and AO) screened and reviewed all articles for inclusion. We restricted our search to meta-analyses published since January 1st, 2016.
Inclusion/Exclusion Criteria: All meta-analyses were screened for inclusion based upon the following criteria: reported on a child obesity-related topic (see above), included behavioral interventions, summarized studies via meta-analytical procedures, and included at least two studies. Exclusion criteria were: mechanistic feeding or exercise training studies, institutionalized samples, conference presentations/abstracts with no full-length publication, special populations with known sample size limitations (e.g., cerebral palsy, childhood cancer, neuromotor delays, severe obesity, disordered sleep), solely clinical outcomes (e.g., glucose, blood pressure), or only post-hoc comparisons or confounder/effect modifier comparisons. Once all meta-analyses were identified, their reference lists were reviewed and all studies included in the meta-analyses were retrieved.
Data Extraction: Data for each meta-analysis article were extracted from the summary effects presented in forest plots. Extracted data included the sample size for each individual study (where presented), point estimates, and 95% confidence intervals or standard errors, where reported. Where sample sizes of the individual studies were not reported in the forest plot, they were extracted from the original publications. The metric of the effects presented (Hedges’ g, mean difference in the units of the outcome, or standardized mean difference) was recorded. All summary effects represented in forest plots within each meta-analysis article were translated into standardized mean differences (SMD) for analytical purposes. All effect sizes were corrected for differences in the direction of the scales so that positive effect sizes corresponded to improvements in the intervention group, independent of the original scale’s direction. This correction ensured that all effect sizes were presented in the same direction and could be summarized within and across studies. Two authors (MB and AO) extracted and verified all data from included articles.
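The direction correction described above can be sketched as a small helper (a hypothetical illustration; the function name and the per-outcome direction flag are ours, and would need to be coded for each outcome scale):

```python
def orient_smd(smd, lower_is_better):
    """Align effect direction so positive SMDs favor the intervention.

    `lower_is_better` is True when a decrease on the original scale
    (e.g., BMI, sedentary time) represents improvement, so the sign
    of the extracted SMD is flipped.
    """
    return -smd if lower_is_better else smd
```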
Classifying Studies based on Sample Size and Pilot/Feasibility Status: For our primary analyses, individual studies from the meta-analyses were a priori classified into four categories based upon either sample size or pilot/feasibility designation (i.e., self-identified pilot/feasibility, N ≤ 100, N > 100, and N > 370). The first classification was for pilot/feasibility studies, defined as studies that self-identified in the title, abstract, or main body of the publication as a pilot, feasibility, exploratory, evidentiary, or proof-of-concept trial. Pilot/feasibility trials were coded separately from the other trials irrespective of sample size. Studies that did not self-identify as a pilot/feasibility trial received one of the following three sample size classifications. Following previously published sample size categories, we defined smaller trials as those including 100 or fewer total participants (N ≤ 100 studies) (6). The remaining, non-pilot/feasibility trials were classified as N > 100 (i.e., excluding both pilot/feasibility and N ≤ 100 studies). As a secondary classification among trials with N > 100, we separated the trials in the largest 25% of sample sizes according to the distribution of sample sizes across studies included in the meta-analyses. For this sample of studies, this corresponded to N > 370, which served as our fourth study classification.
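The classification logic above can be expressed as a small function (a hypothetical sketch; the function name and labels are ours, and the N > 370 cutoff is specific to the distribution of sample sizes in this pool of studies, not a general threshold). Because N > 370 is a secondary classification nested within N > 100, the function returns the set of labels that apply:

```python
def classify_trial(n_total, self_identified_pilot):
    """Return the a priori classification labels that apply to a trial.

    Pilot/feasibility status takes precedence regardless of sample
    size; N > 370 trials (the largest 25% in this study pool) remain
    a subset of the N > 100 category.
    """
    if self_identified_pilot:
        return {"pilot/feasibility"}
    if n_total <= 100:
        return {"N <= 100"}
    labels = {"N > 100"}
    if n_total > 370:
        labels.add("N > 370")
    return labels
```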
Prevalence of Smaller and Pilot/Feasibility Trials in Meta-Analyses: We explored the prevalence and overlap of trials using network visualization tools. All individual articles (i.e., edges) included in the identified meta-analyses were coded based upon sample size classification and origin meta-analysis publication (i.e., node), and entered into Gephi (v.0.9.2) (14). We examined the number of unique articles included across meta-analyses by trial sample size and pilot/feasibility classification, and examined the potential overlap of individual trials across multiple meta-analyses (i.e., whether individual studies appeared in more than one meta-analysis).
Data Analyses
Influence of Trial Classification on Summary Effects from Meta-Analyses: For the purpose of this study, we defined the vibration of effects (VoE) as the difference between the originally reported summary meta-analytical SMD and the summary SMD re-estimated after restricting analyses to trials of the classifications defined above (15). We represent the VoE using two metrics: 1) the absolute difference between the originally reported and re-estimated summary SMDs; and 2) the percent difference, computed by dividing the absolute difference by the average of the originally reported and re-estimated summary SMDs.
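The two VoE metrics reduce to a short calculation (a minimal sketch with a hypothetical function name; note that the percent difference uses the average of the two estimates as its denominator, so it is undefined when the two estimates average to zero):

```python
def vibration_of_effects(smd_original, smd_reestimated):
    """Compute the two VoE metrics: absolute and percent difference.

    The percent difference divides the absolute difference by the
    mean of the originally reported and re-estimated summary SMDs.
    """
    abs_diff = abs(smd_original - smd_reestimated)
    mean_est = (smd_original + smd_reestimated) / 2
    pct_diff = 100 * abs_diff / mean_est
    return abs_diff, pct_diff
```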
We also used meta-regression to examine how the proportion of studies of a given classification within a summary SMD influenced the VoE, defined as the absolute difference in the SMD. The absolute difference, which captures the overall deviation (i.e., vibration) from the originally reported estimate regardless of direction, served as the dependent variable; the proportion of studies, grouped into quintiles (i.e., 0–20%, 20–40%, 40–60%, 60–80%, and 80–100% of studies), was entered as dummy-coded independent variables for analyses restricted to N > 100 and N > 370 studies. In models restricted to pilot/feasibility and N ≤ 100 studies, too few (n = 3) summary ES were composed of 80–100% of studies of these classifications; therefore, a 60–100% group was created and entered into the models along with 0–20%, 20–40%, and 40–60%. Separate models were run for each of the four study classifications. Each model controlled for the number of articles included in a meta-analysis to account for differences in the total number of studies across meta-analyses. Models were estimated using the meta commands in Stata (v.16.1, College Station, TX).
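The structure of the dummy-coded regression can be illustrated on simulated data (all values below are hypothetical). This plain ordinary-least-squares sketch is only an approximation of Stata’s meta commands, which additionally weight each summary SMD by its inverse variance:

```python
import numpy as np

# Hypothetical data: one row per re-estimated summary SMD.
rng = np.random.default_rng(0)
n = 100
abs_diff = rng.gamma(2.0, 0.05, n)    # |original - re-estimated| SMD
prop = rng.uniform(0, 1, n)           # proportion of studies of the class
n_articles = rng.integers(5, 40, n)   # total studies in each meta-analysis

# Dummy-code the proportion into quintiles; 0-20% serves as the referent.
edges = [0.2, 0.4, 0.6, 0.8]
quintile = np.digitize(prop, edges)   # bin index 0..4
dummies = np.eye(5)[quintile][:, 1:]  # one-hot, referent column dropped

# Design matrix: intercept, four quintile dummies, number of articles.
X = np.column_stack([np.ones(n), dummies, n_articles])
beta, *_ = np.linalg.lstsq(X, abs_diff, rcond=None)
```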
Additionally, we assessed the agreement (i.e., concordance) in nominal statistical significance (P ≤ 0.05 vs. P > 0.05) between the re-estimated summary effects and the originally reported meta-analytical effects. Summary effects were classified as either significant or non-significant, and concordance was evaluated using the kappa coefficient. Finally, the association between study classification and precision, defined as 1/SE, was investigated by comparing the precision of studies across the four classifications and deciles of precision. Summary ES were estimated across deciles of precision.
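The concordance calculation can be illustrated with a direct implementation of Cohen’s kappa for two sets of binary significance calls (a hypothetical sketch; in practice this would typically be computed with a statistical package):

```python
def cohens_kappa(original_sig, reestimated_sig):
    """Cohen's kappa for agreement between two binary significance calls.

    Each argument is a list of booleans (True = P <= 0.05) for the
    originally reported and re-estimated summary effects. Undefined
    (zero division) when chance agreement is exactly 1.
    """
    n = len(original_sig)
    observed = sum(a == b for a, b in zip(original_sig, reestimated_sig)) / n
    p_orig = sum(original_sig) / n
    p_rest = sum(reestimated_sig) / n
    expected = p_orig * p_rest + (1 - p_orig) * (1 - p_rest)
    return (observed - expected) / (1 - expected)
```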
Case Study Examples: To illustrate the VoE resulting from including smaller or pilot/feasibility studies in meta-analyses, we searched the US Preventive Services Task Force and Community Preventive Services Task Force websites and identified three recommendations that met the following criteria: targeted youth (≤ 18 yrs); focused on a topic area related to childhood obesity (obesity, overweight, physical activity, diet, nutrition, sedentary, screen, fitness); were based upon a meta-analysis; and presented the data in a forest plot, allowing extraction of the computed SMD and measure of variance for each individual study. For the identified meta-analyses, we retrieved the publications of the included studies and categorized them according to our four classifications: self-identified pilot/feasibility, N ≤ 100, N > 100, and N > 370. Using the procedures outlined above, we compared the re-estimated summary SMDs against the originally reported summary SMDs to identify differences in the estimates and conclusions when considering trials of differing size.
All data were entered into Comprehensive Meta-Analysis (v.3.3.07, Englewood, NJ) to calculate standardized mean difference effect sizes for each reported outcome across all studies. Summary SMDs were computed for the originally reported summary SMD and for all comparisons based on study classifications. All analyses were made at the summary SMD level and computed using both fixed-effect estimates and DerSimonian–Laird random-effects estimates (16). Where the originally reported summary SMD was comprised of studies all of the same classification, no comparisons were made (e.g., 9 studies in a summary SMD, all classified as N ≤ 100).
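A minimal sketch of the two pooling approaches, assuming per-study SMDs and standard errors extracted from a forest plot (the function name is ours; this is the textbook inverse-variance fixed-effect estimate and the DerSimonian–Laird moment estimator of between-study variance, not the exact routine used by Comprehensive Meta-Analysis):

```python
def pool_smd(smd, se):
    """Fixed-effect and DerSimonian-Laird random-effects summary SMD.

    `smd` and `se` are lists of per-study standardized mean
    differences and their standard errors.
    """
    w = [1 / s**2 for s in se]            # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * d for wi, d in zip(w, smd)) / sw

    # DerSimonian-Laird moment estimate of between-study variance (tau^2),
    # truncated at zero.
    q = sum(wi * (d - fixed) ** 2 for wi, d in zip(w, smd))
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(smd) - 1)) / c)

    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1 / (s**2 + tau2) for s in se]
    random_eff = sum(wi * d for wi, d in zip(w_re, smd)) / sum(w_re)
    return fixed, random_eff
```

With homogeneous standard errors the two estimates coincide; they diverge as between-study heterogeneity shifts weight toward smaller studies under the random-effects model.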