To the best of our knowledge, this is the first international survey assessing current practice in the methods used for diagnostic test RRs. With 25 participants from across all continents, we generated a broad international sample. We obtained additional information about the strategies currently used to develop these RRs, complementing the findings of our previous scoping review (20).
Briefly, the methods used to develop RRs can be broadly classified into two groups: those limiting the scope and/or affecting the rigor of RR development, and those increasing the resources available for RR development (5, 6, 24, 25). In the first group, we found that most strategies to narrow the scope are not used as a standard method; however, our survey indicated greater use of scope limits than our previous scoping review suggested (20). In addition, we found a high number of participants imposing limits on the search, e.g., by restricting the language or date of electronic searches. More than half of respondents use methodological filters during the literature search, indicating that they are willing to risk missing some studies in order to retrieve a manageable number of search results within the project's shortened timeframe (17, 18). We also confirmed that participants often use a narrative synthesis to describe their findings rather than a formal meta-analysis (20). Participants also reported that they consider the inclusion of previous evidence syntheses useful for streamlining. Although the use of pre-existing reviews is one strategy proposed for the development of RRs, it depends on the availability of existing SRs that satisfy the updated standards of preferred reporting items (26–28), and such SRs may not exist for all topics.
Regarding the resources available for RR development, we found that a considerable number of institutions involved trained staff in the development of diagnostic RRs, although usually only two reviewers were involved. Selection and data abstraction by a single reviewer were common; however, some institutions may prefer selective verification by a second reviewer on a sample of the total citations, a strategy suggested by the survey's participants. We found that roughly one-third of institutions involved stakeholders in the development of RRs. While most standard SRs do not involve stakeholders in their production, RRs might be more relevant for decision-making in certain situations (29–31). We further found limited use of task parallelization, perhaps due to the lack of studies on the usability and impact of these strategies, both in general and for diagnostic test RRs in particular (32–34).
Previous surveys of producers of knowledge syntheses reported slightly higher levels of adoption of RR methods compared to our findings (35). These levels of adoption are also higher than those found in our previous scoping review. It is possible that RRs using methods similar to those used in SRs have a greater chance of being published (24, 36). While we found that few RR methods are used by more than 90% of participants, we also observed that some SR tasks, such as developing a protocol and performing peer review, are commonly implemented despite the time they require. One possible explanation is that the extent of methodological modifications depends on requests from different stakeholders; therefore, in some cases, RRs can be produced following many of the same methods used in standard SRs (20, 37). In addition, it is known that decision-makers are willing to accept only a small risk of an inaccurate answer in exchange for a rapid product; thus, current RR developers would be reluctant to compromise the validity of results by implementing methodological shortcuts and limits (37, 38).
We acknowledge several potential limitations in this study. We obtained a 53% response rate from invited institutions based around the world; non-responders were located mainly in Europe and Latin America. Although we obtained replies from institutions based in similar locations, these missing data could have introduced a risk of selection bias in our findings. We also found that 13 out of 39 institutions replying to our invitation do not conduct RRs, or do not conduct RRs of diagnostic tests. In addition, participants in our survey were mainly representatives of local, national, and regional HTA agencies. The reviews performed by these institutions might have characteristics that differ from those of reviews produced in academic and research settings.