Rapid evidence synthesis
Whilst evidence synthesis can represent the strongest evidence base for innovations, conventional systematic reviews may take up to two years to produce,(1, 2) while even rapid reviews have a timeframe which may range up to a year.(3) Evidence summaries or evidence briefings are a form of rapid evidence synthesis which are usually produced on a shorter timescale driven by decision makers’ needs,(4) and have been found useful in informing decision making, including by sub-national healthcare administrations.(5) Rapid evidence synthesis has been used to inform the commissioning of research(6) and services,(7) and to inform policy making.(8) However, rapid evidence synthesis has not to date been widely used in decision making around innovation adoption, where timescales are shorter.
Background and context
Challenges in getting proven innovations rapidly adopted into systems, policies or practice have long been recognised. In the UK, the Accelerated Access Review provided fresh policy impetus to efforts to develop a faster pathway to identify and adopt high value innovations in the National Health Service (NHS) in England.(9) This set out recommendations to improve efficiency and outcomes for NHS patients by increasing the speed of access to innovative healthcare methods and technologies, including digital products. A key part of this has been to develop infrastructure like the Academic Health Science Networks (AHSNs) and the NHS Accelerated Access Collaborative (AAC) to support innovation in health and care. AHSNs are the agencies charged with supporting the introduction and diffusion of innovative products across the NHS.
In Greater Manchester, Health Innovation Manchester (HInM) is the AHSN with the remit to implement health innovations across the region. Our definition of innovation includes any technology, device, procedure, set of behaviours, routine, or way of working that is new to the Greater Manchester context.(10) Decisions about innovation adoption in Greater Manchester may need to be made rapidly, and for some novel technologies, the available evidence may be limited. It is important that decision making about healthcare innovation should be informed by evidence, and that there should be transparency about the evidence used and its reliability and relevance to the decision problem.
Rapid evidence synthesis in Greater Manchester
To ensure that decisions in Greater Manchester on innovation adoption and roll-out are informed by evidence, we developed a framework for the production of rapid evidence syntheses (RES) for innovations being considered for implementation in Greater Manchester. It has been made publicly available and registered,(11) and is provided here (see supplementary material). The framework builds on earlier work and experience in developing frameworks for briefings, including those to directly inform decision-making by healthcare organisations.(12–14) It also has some similarities to other evidence briefing approaches which were identified in previous work.(15) However, we believe that the approach we have developed is unique in several important respects, and represents a development in combining the key considerations of speed, transparency, a dual emphasis on robustness and relevance of evidence,(16, 17) and usability for stakeholders.
Speed and flexibility
The RES we produce are designed to be requested, undertaken and delivered in a time period of two weeks. We use a streamlined process to enable delivery in a considerably shorter timescale than other rapid evidence synthesis processes.(5–8) Our RES approach is also explicitly designed to take account of the fact that the evidence for innovations may be limited or of limited relevance, and incorporates protocols for dealing with innovations which are complex interventions.(18, 19) Flexible question sets have provisions both for category-level appraisal of the evidence and for component analysis of innovations.(20) This means that even where there is very limited evidence for an innovation per se, an RES can be produced which is capable of informing implementation decisions. We explore examples of this in the section below on Flexibility in structuring the rapid evidence synthesis.
Integration
The production of RES is embedded within, and integral to, the Greater Manchester AHSN innovation pipeline decision-making process, rather than representing an input from a separate organisation. The HInM innovation pipeline draws on ARC and AHSN expertise in implementation science, health care decision-making and lived experience, as well as evidence synthesis and evaluation. RES is recognised as one of the necessary components of the decision-making process alongside patient and public involvement (PPI) input, business case assessment and input from local health and social care stakeholders. The RES does not include recommendations, but its findings contribute to the decisions reached. The assessment of evidence, its relevance and certainty, is therefore integrated into the considerations. The researcher responsible for the RES attends the pipeline qualification meetings before and after the RES is produced, enabling resolution of queries at each stage and supporting integration of the evidence appraisal.
Transparency and consistency
The principles of the GRADE evidence to decision framework are central to our approach to RES, which makes the RES, and the decision process to which it contributes, more transparent, consistent, and reproducible.(16, 17) GRADE provides a clear set of considerations for the formation of judgements about the strength of the evidence base for each question addressed, and is central to the consideration of the certainty and relevance of the evidence, which are inter-related. The available GRADE frameworks have undergone substantive developments since our previous work on evidence briefings.(12)
Relevance
The RES uses GRADE approaches to consider the applicability of the evidence to the Greater Manchester context as well as its reliability. Context, both broadly and narrowly considered, can be key to the impact of introducing an intervention.(21) In the cases of complex or service-level interventions, in particular, it can be difficult to determine the boundary between the intervention and the context.(22) For example, a test for poor prognosis in heart failure(23) has potential implications for an entire treatment pathway,(24) emphasising the importance of adopting a test-and-treat approach.(25) We are also mindful of the fact that many innovations may look superficially simple but are being considered for introduction into complex systems such as primary care.(26)
Although in the first instance our RES consider relevance in terms of a UK-wide NHS context, we also consider the local Greater Manchester context. We do not generally undertake RES where an innovation is already a nationally mandated priority, so local context is potentially important to all the innovations assessed. Local context includes the existing service models, relevant infrastructure, area and population characteristics including urbanicity, relative deprivation etc. GRADE helps to ensure that relevance can be given equal prominence to certainty in evidence summaries. An example of identified limited relevance at the national level is where an RES of an innovation in asthma care identified only evidence from US trials. This is not directly relevant to asthma control in people with asthma in the UK, who at the population level have higher baseline control and use of preventative medication.(27) At the local level, an RES for an innovation improving connections between healthcare staff included evidence from a pilot project where one site was very rural.(28, 29) The evidence from that site was considered likely to be only indirectly relevant to Greater Manchester.
Process and stages of RES
We present a full example of an RES in supplementary material; a snapshot is shown in Fig. 1.
The key elements of the completed RES are shown in Box 1.
BOX 1: Structure of RES
• A headline summary of the certainty and relevance of available evidence
• A bulleted summary for each question addressed
• A description of the innovation
• A set of key questions
• A summary of the search process
• Detailed answers to each question addressed in terms of available evidence and its certainty and relevance.
• Bibliography of sources used to answer the questions
Timeframe and personnel
The RES is designed to produce a “good-enough” answer to contribute to decision-making in a short time-frame, rather than a perfect answer. The methods described are implemented using a median of up to two days of time for an experienced researcher with a background in evidence synthesis. More complex innovations may entail more resource for RES production. More details are provided in the framework (see supplementary information).(11)
Describing the innovation
The first stage is to briefly describe the innovation in terms of its nature and purpose (Box 2). This establishes the type of innovation (e.g. intervention/test/service delivery mechanism), the population or system that is targeted, and the outcomes which should be considered. A comparator will usually also be identified through this process. This stage involves assessing and clarifying the information supplied by the sponsor.
Box 2. Example: Innovation description
Phagenyx is a device which is designed to reduce neurogenic dysphagia (dysfunction of eating).[1] This is dysphagia arising from the disruption of any of the neurological systems or processes involved in the execution of a coordinated safe swallow and occurs in people following stroke and in other conditions such as multiple sclerosis which impact muscle control. Dysphagia also occurs in people who have undergone sustained intubation for any reason. Phagenyx is classed as a pharyngeal electrical stimulation intervention and comprises a base station and a treatment catheter. It is applied over a period of days.
NICE guidance on the management of people with dysphagia following stroke states that they should be offered swallowing therapy at least three times a week, if they are able to participate, for as long as they continue to make functional gains. Therapy could include compensatory strategies, exercises and postural advice.[2]
Developing the questions
Using the innovation description, we formulate a series of questions (Box 3). These begin with the most narrowly focused and move to wider category-based questions. The wider questions consider innovations in the same category and are key to production of a useful RES where evidence for the innovation is limited. Questions use the PICO(S) approach, defining the Population, Intervention (Innovation), Comparator, key Outcomes and Setting (where relevant).(30) The eligible study designs will always include existing evidence syntheses or, in their absence, the most relevant primary research design.
If the innovation is an intervention, then the questions will be ones of effectiveness and safety; where the innovation is (for example) a test or screening tool we consider accuracy as well as the impact on participants and health systems of implementing the technology. For complex interventions each core feature is described, and these are also taken into consideration in the question formation. When evaluating evidence for particular components of a complex intervention we are mindful of the fact that effectiveness in such an intervention may derive not solely from the additive effect of components but from their interaction with each other, as well as with the (often complex) system context.(31) We would therefore consider evidence relating to an individual component to be indirectly relevant to the innovation as a whole. There may be multiple questions of this form (to reflect different populations or comparators, for example).
Whilst we first focus on (1) effectiveness evidence for the specific innovation being assessed, where this is limited we will explore evidence for (2) the category of innovations (“innovations like this”), and then (3) wider categories of relevant innovation (e.g. “innovations with a similar aim”). Categories are sometimes not obvious, particularly where the innovation is complex.(18–20, 22, 32) In the case of wider categories we may ultimately be looking at any intervention with a purpose similar to the index innovation or all interventions for the condition or issue under consideration (see Box 3). These subsequent questions are designed to ensure that useful evidence can be provided where the evidence for the innovation itself is absent, limited or not directly relevant. They are also required when innovations are tests, where the evidence for available treatments should be considered as a whole in the absence of test-and-treat evaluations.(25) These additional questions are designed to be addressed only where the evidence for the first question(s) is considered insufficient. Implementation of the sequential question set is therefore flexible and sensitive to the nature of the identified evidence.
Box 3. Example: Key questions
- Focus on specific innovation: What is the evidence for the impact of Phagenyx for key outcomes in people with neurogenic dysphagia compared to other interventions or to usual care?
- Focus on innovation category: If there is limited evidence for Phagenyx, what is the evidence for the impact of similar interventions (pharyngeal electrical stimulation) for key outcomes in people with neurogenic dysphagia compared to other interventions or to usual care?
- Focus on wider relevant innovations: If evidence for pharyngeal electrical stimulation is limited, what is the evidence for the impact of interventions for neurogenic dysphagia more generally?
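As an illustration only, a question set of this kind could be held as structured PICO(S) records. The minimal Python sketch below is not part of the framework: the class name PicosQuestion, its field names and the wording template in as_text are hypothetical choices made for this example, and the text it produces only approximates the questions in Box 3.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PicosQuestion:
    """One key question in PICO(S) form (hypothetical structure, for illustration)."""
    level: int                     # 1 = specific innovation, 2 = category, 3 = wider
    population: str
    intervention: str
    comparator: str
    outcomes: str
    setting: Optional[str] = None  # recorded only where relevant

    def as_text(self) -> str:
        """Render the question using an assumed wording template."""
        text = (f"What is the evidence for the impact of {self.intervention} "
                f"for {self.outcomes} in {self.population} "
                f"compared to {self.comparator}?")
        if self.setting:
            text += f" (Setting: {self.setting})"
        return text


POPULATION = "people with neurogenic dysphagia"
COMPARATOR = "other interventions or usual care"

# Narrowest question first, widening only as far as needed.
question_set = [
    PicosQuestion(1, POPULATION, "Phagenyx", COMPARATOR, "key outcomes"),
    PicosQuestion(2, POPULATION, "pharyngeal electrical stimulation",
                  COMPARATOR, "key outcomes"),
    PicosQuestion(3, POPULATION, "interventions for neurogenic dysphagia generally",
                  COMPARATOR, "key outcomes"),
]

for question in question_set:
    print(f"Level {question.level}: {question.as_text()}")
```

Keeping the level alongside each question makes the narrow-to-wide ordering explicit, which is the property the sequential approach relies on.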
Types of evidence
Our focus is always on those study designs best able to answer the questions we have developed. We focus on identification of existing evidence synthesis (systematic reviews) where possible; where this is not possible, we focus on the most informative primary evidence. In the case of most innovations this is from comparative studies, giving priority to randomised controlled trials. Where appropriate to the questions we also include diagnostic accuracy or prognostic studies. There are also questions, especially where the focus of an innovation is on patient experience, where mixed methods or qualitative studies will be the most appropriate form of primary evidence.
Identifying evidence
We adopt a pragmatic and iterative approach to identifying evidence. This uses an initially narrow focus to maximise relevance and progresses to a broader evidence base as necessary. We search key resources including NICE guidance;(33) PubMed; and the Cochrane Library, which includes both the Cochrane Database of Systematic Reviews and the Cochrane Central Register of Controlled Trials.
We increasingly encourage sponsors to provide research evidence for the innovation, as would be the case with a submission to NICE; we routinely search the sponsor’s website. Where appropriate we will use subject/domain-specific resources, such as the webpage of a particular Cochrane Group,(34) or the ORCHA database of health apps.(35) Where required we consult with an information specialist. We also use reference checking or forward citation searching of relevant evidence syntheses and primary studies.
Critical appraisal
We use appropriate methods to critically appraise the different types of evidence we identify.
Cochrane reviews are generally considered to represent reliable evidence and we use their summaries and assessments of evidence certainty rather than re-appraising the evidence, unless there are issues around relevance. Where possible with other high quality systematic reviews we will also use the existing assessments of evidence from the review. This approach maximises the use of existing high quality evidence while improving timeliness. We consider the quality of non-Cochrane systematic reviews, using the signalling questions from ROBIS as a guide.(36) We consider the possibility of duplication of evidence between multiple evidence syntheses.(37)
Where there is no existing evidence synthesis, or we have concerns about the robustness or relevance of a systematic review, we consider primary evidence for the question. We also move to assessing primary research where the existing synthesis has only partially addressed a question, for example because eligibility criteria were narrower. Conversely, where a review has a broader remit, we may look at the included primary studies relevant to our question. In the example of the Phagenyx RES we looked at the subgroup of RCTs assessing Phagenyx within the Cochrane review of interventions for dysphagia in stroke.(38)
Assessment of primary studies considers both the capacity of the study design(s) to answer the question and the risk of bias in the identified studies, in order to produce overall judgements of reliability. Because of our narrow timescale, we do not undertake full assessments but, as with ROBIS, are guided by the domains used. For example, for randomised controlled trials we are guided by the criteria and considerations of the Cochrane Risk of Bias tool;(39) for other study designs we consider questions posed by tools such as ROBINS-I and QUIPS.(40, 41)
Relationship with GRADE
In forming judgements about the certainty of the evidence we are guided by the principles of GRADE.(17, 42) GRADE assesses the certainty of evidence through evaluation of several domains in order to produce an assessment of high, moderate, low or very low certainty. The first domain is the risk of bias in the evidence, which we consider as outlined above. This is considered alongside imprecision, inconsistency, indirectness (how directly relevant the evidence is) and publication bias. There are adaptations of GRADE for non-effectiveness questions.(43, 44)
Apart from risk of bias the domains most relevant to our rapid evidence syntheses are usually imprecision (because of small sample sizes) and indirectness (often a function of context): there is often insufficient evidence to determine inconsistency between studies for the initial questions because there are usually only a small number of studies. The evidence for category-level questions is often of higher certainty than the evidence for the innovation itself; here the domains of inconsistency (and completeness of evidence (publication bias)) are more likely to be considerations.
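The way domain-level concerns translate into an overall certainty rating can be illustrated with a deliberately simplified sketch. It assumes the standard GRADE starting points (randomised evidence starts at high certainty, non-randomised evidence at low) and downgrading by one level for a serious concern or two for a very serious concern. The function name grade_certainty and the numeric scoring are hypothetical; in practice these are judgements rather than arithmetic, and GRADE also allows upgrading in some circumstances, which is omitted here.

```python
from typing import Dict

LEVELS = ["very low", "low", "moderate", "high"]
DOMAINS = ("risk_of_bias", "imprecision", "inconsistency",
           "indirectness", "publication_bias")


def grade_certainty(randomised: bool, concerns: Dict[str, int]) -> str:
    """Return an overall certainty level after downgrading.

    concerns maps a domain name to 0 (no serious concern),
    1 (serious) or 2 (very serious). Simplified illustration only.
    """
    unknown = set(concerns) - set(DOMAINS)
    if unknown:
        raise ValueError(f"Unknown domain(s): {sorted(unknown)}")
    if any(value not in (0, 1, 2) for value in concerns.values()):
        raise ValueError("Each concern must be 0, 1 or 2")

    start = 3 if randomised else 1          # index into LEVELS
    downgrade = sum(concerns.values())      # total levels to rate down
    return LEVELS[max(0, start - downgrade)]


# Hypothetical inputs: small randomised trials with serious imprecision.
print(grade_certainty(True, {"imprecision": 1}))
# Hypothetical inputs: randomised evidence that is imprecise and indirect.
print(grade_certainty(True, {"imprecision": 1, "indirectness": 1}))
```

Because certainty is assigned per outcome, a sketch like this would be applied separately to each outcome within a question rather than once per study.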
We bear in mind that where inconsistency is present (as at the wider category level), this may be a consequence of either – or both – differences in the interventions or the systems in which they are evaluated as well as differences in participants or outcome measures. While some interventions are clearly complex, even simple interventions are frequently implemented into complex health systems and this is especially true of those which would represent changes in patient management.(26) This especially includes diagnostic and prognostic tests, for which we always primarily ask about the effect of testing on the people involved and their management.(25)
Imprecision is usually the consequence of small studies with insufficient participants; this results in wide confidence intervals and effect estimates which would be highly likely to change with further evidence. Indirectness is also often an issue for some or all of the evidence. Because innovations assessed are novel there is often only a partial evidence base, where the evidence may be only indirectly relevant to many of the people in the question, although directly relevant to the group represented in the studies.
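The link between small studies and wide confidence intervals can be shown with a short worked example. The sketch below uses a simple Wald (normal approximation) 95% confidence interval for a proportion, chosen here only for brevity; it is not a method described in the framework, and the event rate and sample sizes are invented for illustration.

```python
import math
from typing import Tuple


def wald_ci(events: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% confidence interval for a proportion (Wald method)."""
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("Require n > 0 and 0 <= events <= n")
    p = events / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    # Clip to the valid range for a proportion.
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical studies, all observing a 30% event rate.
for n in (20, 200, 2000):
    low, high = wald_ci(events=n * 3 // 10, n=n)
    print(f"n={n:>4}: 30% (95% CI {low:.1%} to {high:.1%}), width {high - low:.1%}")
```

With the same observed rate, the interval narrows roughly in proportion to the square root of the sample size, which is why estimates from small studies are the ones most likely to shift as further evidence accumulates.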
Our considerations of relevance (which GRADE considers as (in)directness) are key to our assessments. In addition to the consideration of indirectness which informs our assessment of the certainty of the evidence we also consider the relevance of the evidence to the context and health system in which the innovation would be implemented – in this case Greater Manchester in the UK.
Synthesis of the evidence
We use the identified evidence to produce narrative summaries of the evidence for the key questions in the RES. We always summarise evidence for core question(s) relating to the innovation, although we may identify little or no (useful) evidence. We provide a separate answer to each question addressed.
Where possible we summarise existing evidence syntheses, together with either their existing GRADE assessment or, if these are not available, a judgement based on our assessment of the GRADE considerations. We also provide an assessment of how relevant the evidence from the existing synthesis is to the question.
Where we have been unable to identify a relevant existing evidence synthesis, we summarise the primary studies identified. We use a narrative summary to report effect estimates (with confidence intervals) and their certainty and relevance; we would very rarely seek to undertake meta-analysis.
We outline the certainty and relevance of the evidence for each outcome in the question, distinguishing where appropriate the population or subgroup to whom it is directly relevant. In the Phagenyx example, the evidence is directly relevant to people who have dysphagia following stroke, who represent a subgroup of people with dysphagia. We adopt the GRADE principle of assigning judgements around certainty to a particular outcome rather than at the study level. Where appropriate we report the evidence for each component of an intervention or intervention bundle (where there is no or very limited evidence for the whole). We provide as nuanced a summary of the evidence as possible, clarifying where evidence has different levels of certainty for different populations, components or outcomes. An example of a full RES is provided in supplementary information.
Producing a summary
We provide two levels of summary information, written in non-technical language.
The first provides a single brief summary of the evidence picture and highlights its certainty and relevance (Box 4).
Box 4. Example: Headline summary
Phagenyx may not change clinical outcomes in people with dysphagia following stroke (low to moderate certainty evidence) but probably increases the likelihood of decannulation in people with tracheotomy and dysphagia following stroke (moderate certainty evidence). This is based on randomised controlled trials. Evidence in neurogenic dysphagia in other conditions is limited.
The second provides a bulleted summary of the certainty and relevance of the evidence for each key question, including (e.g.) nuances of the population to which the evidence is directly relevant (Box 5). This may include aspects of the evidence where relevance to the NHS, or to Greater Manchester, is limited. In both sections, summaries include questions for which we identified no evidence, very limited evidence or very uncertain evidence. The summary follows the approach of the whole evidence synthesis and does not make recommendations to the decision makers.
Box 5. Example: bulleted summary
• Most evidence relates to people with neurogenic dysphagia following stroke. In this population:
• There is low to moderate certainty evidence from RCTs, including a moderately sized and methodologically strong trial, that Phagenyx may not change clinical outcomes in the general population of people with dysphagia following stroke. This is directly relevant evidence to the UK NHS.
• In people with dysphagia and tracheotomy following stroke there is moderate certainty evidence from small but well-conducted RCTs that decannulation is probably more likely in people treated with Phagenyx. This evidence is limited by imprecision but directly relevant to the UK NHS.
• There is indirectly relevant evidence from a Cochrane systematic review that, in people with dysphagia following stroke, swallowing therapy of any type probably has no effect on mortality but probably does reduce length of inpatient stay (moderate certainty evidence) and may reduce the proportion of people with dysphagia (low certainty evidence). Trials of Phagenyx contributed to this much wider review.
• There is limited non-randomised evidence assessing pharyngeal electrical stimulation in people with dysphagia due to causes other than stroke (people with multiple sclerosis and people in ICU).
• Further research may change the findings; the number of people involved is relatively low and new studies could substantially change the results.
Flexibility in structuring the rapid evidence synthesis
As described above, our question series has three possible levels: these relate to (1) evidence for the specific innovation, (2) evidence for innovation category and (3) evidence for wider relevant innovations. Our process involves addressing these questions sequentially, stopping at the point at which we have identified evidence of sufficient certainty and relevance. For the Phagenyx example, suitable innovation-specific (level 1) evidence was identified and no further evidence was required.(38, 45) In another example, a chatbot for mental health,(46) there was limited innovation-specific evidence so the search was extended to evidence for the innovation type (level 2) question.(47, 48) For novel innovations that are not part of a wider innovation group only innovation-specific evidence will be relevant.(28)
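The sequential stopping rule can be expressed as a short sketch. This is a minimal illustration rather than the procedure used in practice: the function address_sequentially and the is_sufficient callback are hypothetical names, and the callback stands in for the researcher's judgement about whether the evidence found is of sufficient certainty and relevance.

```python
from typing import Callable, List, Optional, Tuple


def address_sequentially(
    questions: List[str],
    is_sufficient: Callable[[str], bool],
) -> Tuple[List[str], Optional[int]]:
    """Address questions from narrowest to widest, stopping early.

    Returns the questions addressed and the level (1-based) at which
    sufficient evidence was judged to be found, or None if none was.
    """
    addressed: List[str] = []
    for level, question in enumerate(questions, start=1):
        addressed.append(question)
        if is_sufficient(question):
            return addressed, level
    return addressed, None


questions = [
    "evidence for the specific innovation",
    "evidence for the innovation category",
    "evidence for wider relevant innovations",
]

# Hypothetical judgements for two contrasting cases.
sufficient_at_level_1 = {questions[0]: True}
extended_to_level_2 = {questions[0]: False, questions[1]: True}

for label, judgements in (("stops at level 1", sufficient_at_level_1),
                          ("extends to level 2", extended_to_level_2)):
    addressed, level = address_sequentially(
        questions, lambda q: judgements.get(q, False))
    print(f"{label}: addressed {len(addressed)} question(s), stopped at level {level}")
```

Every question that is addressed is retained in the output, reflecting the fact that questions with little or no evidence are still reported in the summary.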
The use of these question sets allows us to be agile in our approach to RES. Where we consider a multicomponent or bundled innovation we can rapidly review evidence for the innovation as a whole and, where required, evidence for the innovation components. An example of this is the RES we carried out for RESTORE-2, a tool for care home staff which consists of three key components: identification of “soft signs” of possible physical decline; an early warning score; and a structured communication plan. We identified limited evidence for the intervention as a whole,(49) so looked at level 1 and 2 questions, as required, for the different innovation components.(50–52)
The relevance of evidence reported in the RES is considered during subsequent decision making, with transparent and cautious extrapolation of indirect data where required. For example, in a RES for an innovation for both people with asthma and people with chronic obstructive pulmonary disease (COPD) we found only randomised evidence for people with asthma.(53, 54) This was extrapolated to people with COPD in the absence of other suitable evidence, but we also considered a level 2 question for people with COPD.(55, 56)