To the best of our knowledge, this is the first study to identify the methodological approaches used for assessing the certainty of evidence in URs that included SR-MAs. Overall, 138 URs were included, consisting of 96 URs of interventions and 42 URs of non-interventions. Only one-third of the URs of interventions assessed the certainty of evidence, and the GRADE approach was the method most frequently used. URs published in journals with high JIFs were more likely to assess the certainty of evidence than URs published in journals with low JIFs. About two-thirds of the URs of non-interventions assessed the certainty of evidence, most often using the criteria for credibility assessment. Nearly 90% of the URs performed a methodological quality assessment, and AMSTAR was the tool most frequently used for this purpose.
The certainty of the evidence is the extent to which we can be confident that an estimate of effect is adequate to support a decision or recommendation. High certainty means that the investigators are very confident that the effect they found across studies is close to the true effect, and vice versa [147]. Assessing the certainty of the evidence is essential when weighing the benefits and harms of a treatment or intervention [148]. Moreover, the certainty of the evidence can be used to develop clinical practice guidelines and recommendations. Likewise, epidemiological investigations can help establish evidence linking an exposure to the incidence of a health condition in a population; such studies are expected to play a key role in gauging the burden of disease, delineating guidelines for prevention, and streamlining the development of treatments. URs of both interventional and observational studies should aim to provide the highest certainty of evidence to facilitate better health outcomes. Despite the necessity of assessing the certainty of the evidence in URs, there is no consensus on which approach should be the method of choice.
A previous study by Hartling et al [1] reported that only 16% of overviews of reviews published between 2000 and 2011 assessed the certainty of the evidence; by comparison, our study found that one-third of the included URs of interventional studies did so. In line with that study [1], the GRADE approach was the method most frequently used for assessing the certainty of the evidence in the URs. One reason could be that the GRADE approach is a well-established tool developed to determine the certainty of evidence based on several factors, namely risk of bias, imprecision, indirectness, inconsistency, and publication bias [147]. However, this tool was primarily designed for assessing the quality of evidence from primary studies. Thus, further guidance is needed to ensure appropriate use and interpretation of the GRADE tool when it is applied to assess the quality of evidence from SRs rather than primary studies [1].
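For illustration only, the sketch below shows one way a GRADE-style rating could be operationalised in code. The starting levels, the one-level downgrade per serious concern, and all function and variable names are our own assumptions introduced for this example; the GRADE approach itself relies on structured judgement rather than a fixed algorithm.

```python
# Illustrative sketch only; not the official GRADE algorithm.
# Assumed convention: randomised evidence starts at "high", observational at "low",
# and each domain with a serious concern downgrades the rating by one level.

LEVELS = ["very low", "low", "moderate", "high"]

def grade_certainty(study_design: str, concerns: dict) -> str:
    """Return a GRADE-style certainty level for a body of evidence.

    study_design: "randomised" or "observational"
    concerns: number of downgrades per domain, e.g.
              {"risk_of_bias": 1, "imprecision": 0, "indirectness": 0,
               "inconsistency": 1, "publication_bias": 0}
    """
    level = 3 if study_design == "randomised" else 1   # assumed starting points
    level -= sum(concerns.values())                     # downgrade per concern
    return LEVELS[max(level, 0)]

# Example: randomised evidence with serious risk of bias and inconsistency
print(grade_certainty("randomised",
                      {"risk_of_bias": 1, "imprecision": 0, "indirectness": 0,
                       "inconsistency": 1, "publication_bias": 0}))  # -> "low"
```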
Furthermore, this study demonstrated that several methodological approaches for assessing the certainty of evidence were used in the URs. We found that the criteria for credibility assessment, which were released recently [10, 148], were the second most frequently used method in URs of interventional studies. In contrast, almost all of the URs of observational studies in our review used these epidemiological credibility assessment criteria. The reason our findings differ from the previous studies [1, 3] is likely that we specifically considered URs that included MAs. The criteria for credibility assessment classify the certainty of the evidence according to several statistical criteria that are usually reported in MAs. However, this approach relies on arbitrary cut-off values, and the cut points for each component of these criteria varied among previously published URs, reflecting the need for guidance. Although Aromataris et al., a methodology working group formed by the JBI (formerly named the URs Working Group), published guidance on how to conduct and report a UR [10], a methodology for the certainty assessment was not provided.
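To make the preceding point concrete, the sketch below shows how such a statistical credibility classification might look in code. The specific thresholds (number of cases, random-effects P value, I², prediction interval, small-study effects, and excess significance) are assumptions based on cut-offs commonly cited in published URs and, as noted above, they vary between reviews; the function and parameter names are hypothetical.

```python
# Illustrative sketch of a credibility-style classification; the cut-offs below are
# assumed examples and differ across published URs.

def credibility_class(n_cases: int, p_random: float, i_squared: float,
                      pi_excludes_null: bool, small_study_effects: bool,
                      excess_significance: bool) -> str:
    """Classify a meta-analytic association using assumed statistical cut-offs."""
    if (n_cases > 1000 and p_random < 1e-6 and i_squared < 50
            and pi_excludes_null and not small_study_effects
            and not excess_significance):
        return "convincing"
    if n_cases > 1000 and p_random < 1e-6:
        return "highly suggestive"
    if n_cases > 1000 and p_random < 1e-3:
        return "suggestive"
    if p_random < 0.05:
        return "weak"
    return "not significant"

# Example: a large, highly significant, homogeneous association with no bias signals
print(credibility_class(2500, 5e-7, 40.0, True, False, False))  # -> "convincing"
```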
This study demonstrated that URs with a certainty assessment were more often published in higher-impact journals and that more recent URs were more likely to assess the certainty of the evidence. One reason could be that the assessment helps to reflect the certainty of the results and facilitates the translation of the evidence into guideline recommendations. Therefore, our findings highlight the importance of guidance for assessing the certainty of the evidence in URs that recommends the most appropriate tools and provides standards for those conducting URs.
This study also demonstrated that the majority of the included URs performed a methodological quality assessment. This was more frequent than in a previous study [1], which reported an assessment of methodological quality in only 37% of the overviews of reviews. One reason could be that this process has been strongly recommended in the methodological guidance for producing URs [2] and has been implemented for longer than the certainty assessment. This process is essential to ensure that the methodological quality of the SR-MAs included in URs is adequately assessed and incorporated into the results and conclusions. In addition, we found that the most frequently used tool for methodological quality assessment changed from the Oxman and Guyatt Overview Quality Assessment Questionnaire (OQAQ) to AMSTAR. The AMSTAR tool has been recommended since 2007, and the revised version, AMSTAR 2, was released in 2017. Given that the revised tool was introduced only recently, the methods advocated in published guidance have evolved over time, and the variation in tools used for methodological quality assessment reported in this study confirms the need for updated guidance on conducting URs. Furthermore, researchers should assess both the certainty of evidence and the methodological quality and report the results in their URs, which could in turn enable translation into guideline recommendations; otherwise, they should present valid reasons for not performing these assessments.
Our study has some limitations. First, the definition of included studies was restricted to URs. This might not cover other kinds of reviews, for example, overviews of reviews and reviews of reviews. Therefore, our findings with regard to the terminology used to describe “umbrella reviews” and the methods used might not be comprehensive or wholly representative. However, there is no universally accepted technical term for this type of review, which summarizes or synthesizes findings from systematic reviews. The term UR has been used increasingly, and studies describing the methodological approaches of URs remain sparse to date. Second, our study focused on describing the methods used in previously published URs, and most of them did not provide reasons for their choice of methods. Thus, we could not assess why each UR used a different approach for assessing the certainty of evidence and methodological quality. However, a major strength of our study is that it provides a broad picture of the certainty assessment methods used in URs of both interventional and observational studies. Clearly, authors of URs of observational studies prefer the criteria for credibility assessment, whereas the GRADE tool is mostly favored for URs of experimental studies. This highlights an unmet need for a tool suited to URs of experimental studies. Nevertheless, future research could extend this review of the methods used for assessing the certainty of evidence and methodological quality to URs that include other study designs.