DOI: https://doi.org/10.21203/rs.2.12470/v1
Quality assessment of included studies is an integral component of rigorous systematic reviews as this step helps to ensure findings are externally valid. However, systematic reviews often include both randomized controlled trials (RCTs) and observational studies, and existing quality assessment tools are limited in the degree to which they allow for quality comparisons across these different study designs. Many existing quality tools tend to favor RCTs over other study designs, and while RCTs are among the most rigorous studies, observational studies often generate valuable evidence in situations where an RCT is not feasible. In this paper, we describe the development and validation of a novel Study Quality Assessment of Design (SQUAD) tool that can be used to assess the quality of multiple types of study designs within a systematic review.
The SQUAD tool was developed for a systematic review of studies examining the effects of clinician-patient interpersonal interventions on Quadruple Aim outcomes (i.e., population health, cost, patient and provider experience)19. First, we conducted a search for existing quality assessment tools using PubMed, Google, and systematic reviews in the literature. For each of the 6 tools that we identified,12,29,30,52,59,79 we examined the quality assessment criteria and the types of studies that could be assessed with the tool. Our search suggested that existing tools were unable to adequately handle diverse study designs. We determined that a tool combining the Cochrane tool for grading RCTs52 with the risk of bias criteria for controlled observational studies developed by the Effective Practice and Organisation of Care (EPOC) reviews29 could be used to compare quality across both RCTs and observational studies. Thus, these two instruments formed the foundation for the SQUAD tool, which incorporates elements of the Cochrane and EPOC assessments, with simplified language and processes for ease of use. The domains of the SQUAD tool are listed in Table 1.
To test and refine the tool, we performed a multi-phase pilot with 12 studies drawn from the systematic review (8 RCTs and 4 observational studies).2,5,9,16,20-23,32,37,53,75 For each phase, 3-4 studies were assessed by two raters (AT and SB). Ratings for the studies were reviewed by a third team member (MH), and discrepancies were resolved by consensus. The tool was revised after each of these meetings before additional studies were reviewed. Most differences between the two raters were due to missing or misplaced information in the text; discrepancies rarely arose due to different interpretations of the descriptions of the domains. We examined inter-rater reliability after each iteration by calculating the model 3 intra-class correlation coefficient for raters.
After refining the SQUAD tool, we tested it by evaluating all 77 studies in the systematic review,1-11,13-18,20-28,31-51,53-58,60-78,80-84 which comprised 68 RCTs and 9 observational studies. Four reviewers rated randomly assigned studies using the Covidence systematic review online interface (Covidence, Veritas Health Innovation Ltd, Melbourne, 2018) until each study had two sets of ratings. We examined inter-rater reliability for the quality assessments by calculating the model 2 intra-class correlation coefficient. Statistical analysis was conducted in R (RStudio Inc., Boston, Version 1.1.383, 2017).
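As an illustration of the reliability statistics named in this paper, the following is a minimal sketch of how single-rater ICC(1,1) and ICC(2,1) can be computed from a studies-by-raters matrix using the standard one-way and two-way ANOVA mean squares. It is written in Python rather than R, it is not the authors' analysis code, and the function name `icc_two_forms` and the example ratings are hypothetical (the ratings are illustrative and are not data from this review).

```python
import numpy as np


def icc_two_forms(ratings):
    """Return (ICC(1,1), ICC(2,1)) for an n_items x k_raters matrix.

    Hypothetical helper for illustration only; assumes a complete matrix
    with no missing ratings.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()

    # Sums of squares for rated items (rows) and raters (columns)
    ss_total = ((x - grand_mean) ** 2).sum()
    ss_rows = k * ((x.mean(axis=1) - grand_mean) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand_mean) ** 2).sum()
    ss_within = ss_total - ss_rows   # one-way within-item variation
    ss_error = ss_within - ss_cols   # two-way residual variation

    # Mean squares
    ms_rows = ss_rows / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # One-way random effects, single rater
    icc_1_1 = (ms_rows - ms_within) / (ms_rows + (k - 1) * ms_within)
    # Two-way random effects, absolute agreement, single rater
    icc_2_1 = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    return icc_1_1, icc_2_1


# Illustrative ratings only: 6 rated items (rows) by 4 raters (columns)
example_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc11, icc21 = icc_two_forms(example_ratings)
print(round(icc11, 2), round(icc21, 2))  # prints 0.17 0.29
```

In this example ICC(2,1) is higher than ICC(1,1) because the two-way model separates systematic differences between raters from residual error, whereas the one-way model treats all within-item variation as error.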
The SQUAD tool is available in the online Appendix. During the pilot with 12 studies2,5,9,16,20-23,32,37,53,75 (8 RCTs and 4 observational studies), the ICC (1,1) improved with each iteration: 0.41, 0.64, 0.69, 0.89. During the assessment of 77 studies in the systematic review (68 RCTs and 9 observational studies), the ICC (2,1) was 0.72. ICCs are presented in Table 2.
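The reported pilot ICCs can be cross-checked against the F statistics listed in Table 2. The sketch below assumes the usual one-way ANOVA definition $F = MS_B / MS_W$ (between-item over within-item mean square) and $k = 2$ raters, as described for the pilot; the symbols $MS_B$, $MS_W$, and $k$ are notation introduced here for illustration, and the F values are the rounded values from the table.

```latex
\mathrm{ICC}(1,1) = \frac{MS_B - MS_W}{MS_B + (k-1)\,MS_W} = \frac{F - 1}{F + k - 1}

\text{With } k = 2:\quad
\frac{2.4 - 1}{2.4 + 1} \approx 0.41,\qquad
\frac{4.5 - 1}{4.5 + 1} \approx 0.64,\qquad
\frac{5.4 - 1}{5.4 + 1} \approx 0.69,\qquad
\frac{17 - 1}{17 + 1} \approx 0.89
```

These values agree with the four pilot ICCs reported above. The same shortcut does not apply exactly to ICC(2,1), whose denominator also includes a term for variation between raters.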
The SQUAD tool offers a pragmatic approach to quality assessment for researchers evaluating RCTs and observational studies in systematic reviews. It is unique in that it can handle both types of studies without being biased towards a single study design, while maintaining the rigor necessary to help ensure external validity. It is also simple and accessible to researchers with varying levels of experience. A limitation of the development process is that the tool was tested in a single systematic review of RCTs and observational studies of clinician-patient interpersonal interventions. Additional testing is warranted to validate the tool across more study designs covering diverse content. However, our goal at this stage is to present a novel tool for use in systematic reviews of studies with a mixture of designs. The authors acknowledge that the tool could benefit from further refinement, which is an avenue for future research.
SQUAD is a practical and reliable tool for assessing the quality of randomized trials and observational studies when synthesizing findings for systematic reviews. This tool will help ensure that results from systematic reviews maintain external validity when incorporating multiple study designs.
ICC = intraclass correlation coefficient
RCT = randomized controlled trial
Ethics approval and consent to participate: Not applicable
Consent for publication: Not applicable
Availability of data and material: The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Competing interests: The authors declare that they have no competing interests.
Funding: This study was supported by a grant from the Gordon and Betty Moore Foundation (#6382).
Authors’ contributions: AT performed the search for existing tools and helped create the SQUAD domains, conducted quality assessment ratings, analyzed and interpreted data regarding all statistical analyses performed, and drafted the manuscript. MH led the systematic review, oversaw the creation of SQUAD and its domains, and was an editor of the manuscript. SB conducted quality assessment ratings and was an editor of the manuscript. DZ was the primary mentor for the systematic review and SQUAD tool development, and was an editor of the manuscript.
Acknowledgments: The authors would like to acknowledge Derek Boothroyd for his guidance in data analysis, and Gabriella Piccininni and Laura Jacobson for their contributions towards the quality assessments for the systematic review.
Disclaimer: The contents of this article do not represent the views of VA or the United States Government.
Table 1. SQUAD Tool Domains

| Domain | Description |
| --- | --- |
| Randomization* | Randomization to experimental and control groups |
| Protection against selection bias* | Study incorporates methods to protect against selection bias during study recruitment, assignment, and identification |
| Blinding | Methods used to ensure that study participants (e.g., physicians and patients) are unaware of study objectives |
| Protection against contamination | Reasonable and successful measures were taken to prevent the control group from being exposed to the intervention |
| Baseline measurement | Baseline data was assessed for all groups before the intervention was administered to any group |
| Inclusion of outcomes | Outcome data is present for all participants in a study for each main outcome |
| Exclusions of findings | Absence of selective outcome reporting (all outcomes are reported) |
| Acknowledgment of contradictions | Outcomes and reported results are consistent, or authors identify reasons for discrepancies between outcomes and reported results |
| Protection against detection bias | Outcomes are measured objectively or methods are used to prevent/minimize bias |
| Reliable primary outcome measure(s) | Outcomes are objectively measured or have high inter-rater reliability |
| Other sources of bias* | Additional concerns about bias not addressed in other domains in the tool |

*Domains not applicable to every study. Each domain is given a score of 1 to 3 based on the level of adherence to the principle measured by it. An average of scores per study is calculated to determine overall study score.
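The scoring rule in the table note can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the function name `squad_overall_score` is invented here, scores are assumed to be the integers 1, 2, or 3, and domains marked not applicable are assumed to be left out of the average (the note does not state how such domains are handled).

```python
from statistics import mean


def squad_overall_score(domain_scores):
    """Average the applicable domain scores for one study.

    domain_scores maps a domain name to 1, 2, or 3, or to None when the
    domain does not apply to the study (assumption: None is excluded
    from the average).
    """
    applicable = [s for s in domain_scores.values() if s is not None]
    if not applicable:
        raise ValueError("at least one applicable domain score is required")
    for score in applicable:
        if score not in (1, 2, 3):
            raise ValueError(f"domain scores must be 1, 2, or 3, got {score!r}")
    return mean(applicable)


# Hypothetical observational study: randomization does not apply
example_study = {
    "Randomization": None,
    "Blinding": 2,
    "Baseline measurement": 3,
}
print(squad_overall_score(example_study))  # prints 2.5
```

Excluding non-applicable domains keeps an observational study from being penalized on criteria, such as randomization, that it cannot meet by design, which matches the stated aim of comparing quality across study designs.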
Table 2. Analysis of inter-rater reliability for all trials of SQUAD (alpha=0.05)

| Trial | Type | ICC | F | df1 | df2 | p | 95% CI Lower Bound | 95% CI Upper Bound |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Pilot 1 | ICC (1,1) | 0.41 | 2.4 | 32 | 33 | 0.0082 | 0.080 | 0.65 |
| Pilot 2 | ICC (1,1) | 0.64 | 4.5 | 38 | 38 | <0.001 | 0.41 | 0.79 |
| Pilot 3 | ICC (1,1) | 0.69 | 5.4 | 32 | 33 | <0.001 | 0.46 | 0.83 |
| Pilot 4 | ICC (1,1) | 0.89 | 17 | 30 | 30 | <0.001 | 0.78 | 0.94 |
| Final | ICC (2,1) | 0.72 | 6 | 844 | 845 | <0.001 | 0.68 | 0.75 |