Description of included articles
The database search was conducted from 24 September 2018 to 14 January 2019 and was updated on 4 November 2019. We screened a total of 12,828 records after removing duplicates. We assessed 115 full-text articles for eligibility, of which 67 were excluded based on at least one exclusion criterion. Figure 1 presents the PRISMA flow diagram for inclusion of the 48 relevant papers (13, 21-67).
The articles included were published between 1989 and 2019. Most of the studies were conducted in Canada (22, 23, 25, 26, 30-33, 40, 42, 48, 50, 53, 57, 59, 61, 67), Spain (21, 28, 29, 37, 38, 41, 52, 54-56, 60, 62), and New Zealand (27, 34-36, 49, 58, 63). As presented in Table 1, these studies’ stated goals were mainly related to evaluating validity and reliability, developing prioritization criteria, and creating prioritization tools.
Three development processes of similar tools stand out in our review based on the number of studies conducted: one from the Western Canada Waiting List Project (68), which produced 12 studies included in our review (22, 26, 30-33, 40, 42, 48, 57, 59, 61); one from Spanish research groups, covering 10 studies (21, 28, 29, 37, 38, 41, 55, 56, 60, 62); and one from New Zealand researchers, covering four studies (34, 35, 36, 49). They are reviewed in more detail in the following paragraphs.
Western Canada Waiting List (WCWL) Project
The WCWL Project was a collaborative initiative undertaken by 19 partner organizations to address five areas where waiting lists were considered to be problematic: cataract surgery, general surgery, hip and knee replacement, magnetic resonance imaging (MRI), and children's mental health services (68). Hadorn’s team developed point-count systems using statistical linear models (68). These types of systems have been developed for many clinical and non-clinical contexts such as predicting mortality in intensive care units (APACHE scoring system (69)) and neonatal assessment (Apgar score (70)). In this project, the researchers developed tools using priority scores in the 0-100 range based on weighted prioritization criteria. Based on the studies included in our review, a panel of experts adopted a set of criteria, incorporated them in a questionnaire to rate a series of consecutive patients in their practices, and then used regression analysis to determine the statistically optimal set of weights on each criterion to best predict (or correlate with) overall urgency (42, 57, 59, 61). Reliability (22, 33, 40, 42, 57, 61) and validity (26, 30-33) were also assessed for most of the tools created by this research team and the key results are presented in Additional file 4.
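The weight-derivation step described above can be illustrated with a minimal sketch. It assumes ordinary least squares with an intercept and a final score clipped to the 0-100 range; the criteria ratings, urgency values, and function names are hypothetical and are not taken from the WCWL studies, whose exact models may differ.

```python
# Hypothetical data: each row is one patient rated on three criteria (0-3 scale).
criteria = [
    [0, 1, 0],
    [1, 1, 2],
    [2, 0, 1],
    [3, 2, 3],
    [1, 3, 2],
    [2, 2, 0],
]
# Hypothetical overall urgency (0-100) assigned by the rating clinician.
urgency = [10, 40, 35, 95, 65, 45]


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("criteria are collinear; weights are not identifiable")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_weights(criteria, urgency):
    """Least-squares fit via the normal equations; returns (intercept, weights)."""
    design = [[1.0] + list(row) for row in criteria]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, urgency)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    return coef[0], coef[1:]


def priority_score(row, intercept, weights):
    """Weighted sum of the criteria, clipped to the 0-100 range."""
    raw = intercept + sum(w * x for w, x in zip(weights, row))
    return max(0.0, min(100.0, raw))


intercept, weights = fit_weights(criteria, urgency)
scores = [priority_score(row, intercept, weights) for row in criteria]
```

In this sketch, each fitted weight indicates how much one additional point on a criterion shifts the predicted urgency, which is the sense in which the weights "best predict" the overall rating.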
Spain
A group of researchers in Spain developed a four-step approach to designing prioritization tools. The four steps were: 1) a systematic review to gather available evidence about waiting list problems and the prioritization criteria used, 2) compilation of clinical scenarios, 3) consultation of an expert panel, which was provided with the literature review and the list of scenarios, and 4) rating of the scenarios and weighting of the criteria, carried out in two rounds using a modified Delphi method. The researchers then evaluated the reliability of the tool in the context of hip and knee surgeries (21, 38, 55) as well as its validity for cataract surgeries (21, 37, 41, 56); these results are detailed in Additional file 4.
New Zealand
The third tool detailed in our review was developed by New Zealand researchers. Clinical priority assessment criteria (CPAC) were defined for multiple elective surgery settings (35, 36, 49) based on similar previous work (34). Validity results for these tools are presented in Additional file 4. These tools were part of an appointment system aimed at reforming policy on access to elective surgery in order to improve equity, provide clarity for patients, and achieve a paradigm shift by relating likely benefits from surgery to the available resources (36). The development process was not described in detail in the studies included. As discussed in one study, implementation of the system encountered some difficulties, mostly in achieving consensus on the components and the weighting of the various categories of prioritization criteria (36).
Designs and quality of included studies
We appraised the methodological quality of the 48 articles using the MMAT, which allows for quality appraisal based on the design of the studies assessed. One was a mixed methods study (50), four were qualitative studies (36, 43, 48, 60), five were quantitative descriptive studies (35, 37, 49, 52, 66), none were randomized controlled trials, and the remaining 38 were quantitative non-randomized studies. Most of these quantitative studies had a cross-sectional, prospective, or retrospective design. The overall methodological quality of the articles was good, with a mean score of 3.81/5 (range: 0 to 5). The score associated with each study is presented in Table 1.
Characteristics of the prioritization tools
We listed 34 distinct PPTs from 46 of the articles reviewed. Two articles (46, 48) were included even though no specific PPT was used or developed in these studies. In Kingston and Masterson’s study (46) (MMAT score = 0), the Harris Hip Score and the American Knee Society Score were used as the scoring instruments to determine priority on the waiting list, and in McGurran et al.’s article (48) (MMAT score = 2), the authors consulted the general public to collect their opinions on the appropriateness, acceptability, and implementation of waiting list PPTs. Table 1 shows that PPTs were mostly used in hospital settings (19/34) for arthroplasty (9/34), cataract surgery (5/34), and other elective surgeries (4/34). We found that a different set of tools supports prioritization in 14 other healthcare services. Three tools were designed for primary care, three for outpatient clinics, one for community-based care, and one for rehabilitation. Three studies (26, 43, 52) portrayed the use of a PPT in multiple settings, and four studies (39, 42, 51, 64) did not specify the setting. The PPTs reviewed were mostly (17/34) tools attributing scores ranging from 0 to 100 to patients based on weighted criteria. Other tools (8/34) used priority scores that sorted patients into broad categories (e.g. low, intermediate, high priority). The format of the PPTs reported in the studies was mostly unspecified (26/34), but some studies explicitly specified that the tool was in either paper (4/34) or electronic (2/34) format.
Several stakeholders were involved in the development of the 34 PPTs retrieved, such as clinicians (50% of the PPTs), specialists (35%), surgeons (29%), general practitioners (26%), and others (Figure 2). It is worth mentioning that patients and caregivers were involved in only 15% of the PPTs developed (21, 38, 52, 60, 63, 65), while for 21% of the PPTs, authors did not specify who participated in their development.
Regarding the development process, we have not identified guidelines or standardized procedures explaining how the proposed PPTs were created. Some authors reported relying on literature reviews (44%) and stakeholder consultations (53%) to inform PPT design, but most provided very little information about the other steps of development.
Below is a review of the criteria selected to produce the different PPTs found in this synthesis. First, the number of criteria ranges from 2 to 17 (mean: 7.6, SD: 3.8). As regards their nature or orientation, some PPTs rely on generic criteria, while others are specific to a disease, a service, or a population, as reported in Table 2. The criteria of each PPT are listed in Additional file 3.
In summary, PPTs are typically used in hospital settings for managing access to hip, knee, cataract, and other elective surgeries. Their format is undefined, their development process is non-standardized, and they are mostly developed by consulting clinicians and physicians (surgeons, general practitioners, and specialists).
Reliability and validity of PPTs
Only 26 out of the 48 articles included in this synthesis, representing 23 tools (68%), reported an investigation of at least one of the qualities of the measuring instrument, i.e. reliability and validity. Figure 3 displays the scope of aspects that were assessed.
Inter-rater (21-24, 33, 38, 40, 42, 45, 47, 50, 55, 57, 61, 65) and intra-rater (22, 24, 33, 40, 42, 47, 55, 57, 61) reliability were evaluated by comparing the priority ratings of two groups of raters (inter) and by comparing priority ratings by the same raters at two different points in time (intra). Face validity (33, 38, 41, 47, 52, 55) was determined by consultation with stakeholders (e.g. surgeons, clinicians, and patients). Other validity assessments, such as content (41, 47, 55), construct (21, 23, 26, 27, 30-32, 34, 35, 37, 52, 63), and criterion (26, 32, 35, 41, 45, 47, 52, 55) validity, were appraised using correlations between PPT results and other measures. In fact, some studies compared PPT scores with a generic health-related quality of life measure such as the Short Form Health Survey (either SF-36 or SF-12) (35, 55, 63). Aside from correlations with other measures, PPT validity was evaluated using two other means: disease-specific questionnaires (e.g. the Western Ontario and McMaster Universities Arthritis Index (27, 30, 31, 37, 55) and the Visual Function Index (35, 41)) or another measure of urgency or priority (e.g. the Visual Analogue Scale (21, 22, 26, 30-32, 42, 61) or a traditional method (47)).
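As an illustration of how agreement between two sets of priority ratings can be quantified, the sketch below computes Cohen's kappa, a chance-corrected agreement statistic. This is an assumed choice of metric for illustration only: the reviewed studies used various statistics, and the ratings shown are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two equally long lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same, non-empty set of patients")
    n = len(ratings_a)
    # Proportion of patients placed in the same category by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used a single, identical category
    return (observed - expected) / (1 - expected)


# Hypothetical priority categories assigned to eight patients by two raters.
rater_1 = ["low", "high", "mid", "high", "low", "mid", "high", "low"]
rater_2 = ["low", "high", "mid", "mid", "low", "mid", "high", "high"]

kappa = cohen_kappa(rater_1, rater_2)  # 6/8 observed agreement, 21/64 expected
```

The same function applies to intra-rater reliability if the two lists hold one rater's assessments of the same patients at two points in time.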
One of the objectives of this review was to synthesize results about the quality features of each PPT. The diversity of contexts, settings, and formats PPTs adopted made a fair and reasonable comparison almost impossible. We observed various methods of assessing reliability and validity of PPTs across a number of settings. All the findings relating to the features reported in the articles are presented in Additional file 4. We can conclude that the reliability and the validity of PPTs have generally been assessed as acceptable to good.
Effects on the waiting list process
The limited assessment of actual benefits remains, in our opinion, one of the most important drawbacks of the reported PPTs. In fact, we found that only 10 studies (13, 28, 29, 39, 44, 46, 50, 58, 62, 64) investigated the effects or outcomes of the proposed PPTs, while six studies (23, 36, 48-50, 67) reported opinions expressed by stakeholders about essential benefits and limitations of PPTs.
Waiting time is the most frequently studied outcome (13, 28, 29, 39, 58, 62, 64). Four studies (28, 29, 39, 62) used a computer simulation model to evaluate the impact of the PPT on waiting times. In their simulations, the authors compared the use of a PPT to the first-in, first-out (FIFO) model, and reported mixed findings. Comas et al. (28, 29) showed that prioritization systems produced better results than a FIFO strategy in the contexts of cataract and knee surgeries. They concluded that the waiting times weighted by patient priority produced by prioritization systems were 1.54 and 4.5 months shorter than the ones produced by FIFO in the case of cataract and knee surgeries, respectively. Another study (39) revealed, in regard to cataract surgery, that the prioritization system made it possible for patients with the highest priority score (91-100) to wait 52.9 days less than if the FIFO strategy were used. In contrast, patients with the lowest priority score (1-10) saw their mean waiting time increase from 193.3 days (FIFO) to 303.6 days (39). Tebé et al. (62) noted that the application of a system of prioritization seeks to reorder the list so that patients with a higher priority are operated on earlier. However, this does not necessarily mean an overall reduction in waiting times. These authors concluded that although waiting times for knee arthroplasty dropped to an average wait of between 3 and 4 months throughout the period studied, they could not ascertain that this was directly related to the use of the prioritization system (62).
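The comparison between FIFO and priority-based ordering can be sketched with a toy simulation. The model below is a deliberately simplified assumption (one operation per day, randomly generated arrivals and priority scores) and does not reproduce the simulation models of the cited studies; all names and parameters are hypothetical.

```python
import heapq
import random


def simulate(patients, by_priority):
    """Serve one patient per day; return a list of (priority, wait_days).

    patients: list of (arrival_day, priority_score) tuples.
    by_priority: if True, operate on the highest score first; otherwise FIFO.
    """
    pending = sorted(patients)  # ordered by arrival day
    queue, results = [], []
    day, i = 0, 0
    while i < len(pending) or queue:
        # Add everyone who has arrived by today to the waiting list.
        while i < len(pending) and pending[i][0] <= day:
            arrival, priority = pending[i]
            key = (-priority, arrival) if by_priority else (arrival, -priority)
            heapq.heappush(queue, (key, i, arrival, priority))
            i += 1
        if queue:
            _, _, arrival, priority = heapq.heappop(queue)
            results.append((priority, day - arrival))
        day += 1
    return results


def mean_wait(results):
    return sum(wait for _, wait in results) / len(results)


def priority_weighted_wait(results):
    """Mean waiting time with each patient weighted by priority score."""
    return sum(p * wait for p, wait in results) / sum(p for p, _ in results)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical cohort: 200 patients arriving over 100 days, scores 1-100.
cohort = [(rng.randint(0, 99), rng.randint(1, 100)) for _ in range(200)]

fifo = simulate(cohort, by_priority=False)
prioritized = simulate(cohort, by_priority=True)
```

In this simplified setting, `mean_wait` is identical under both strategies because the same number of operations is performed on the same days, whereas `priority_weighted_wait` can only stay equal or decrease under prioritization. This mirrors the observation that reordering a list redistributes waiting time without necessarily reducing it overall.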
The other three studies conducted a retrospective analysis of patients on waiting lists (13, 58, 64). With a PPT (13) that computed a total priority score and then sorted patients into groups (group 1 having the greatest need for surgery and group 4 the least), the mean waiting time for surgery was 3 years shorter across all indication groups. In a study of patients waiting for cardiac surgery, the clinician’s classification was compared to the New Zealand priority scores (0-100) based on clinical features (58). According to this study, it is difficult to determine whether waiting times were reduced as a result of the use of the PPT, because findings only showed the reorganization of the waiting list based on each category and priority score. However, waiting times were reduced for the least urgent patients in both groups. Waiting times before surgery were between 161 and 1199 days based on the clinician’s classifications, compared to between 58 and 486 days based on the New Zealand priority scores (58). In addition, Valente et al. (64) studied a model to prioritize access to elective surgery and found no evident effects in terms of reduction or increase of the overall waiting list length.
Effects on the care process
Some studies addressed the effects of PPTs on the demand for healthcare services (44, 46, 50, 64). The introduction of a new need-based prioritization system for hip and knee arthroplasty reduced the number of enquiries and cancellations (46). Changes were also observed after the implementation of a PPT for physiotherapy services, with an increase of 38% in high-priority clients in the caseload (50). PPTs also had an impact on the healthcare delivery process. For example, Mifflin and Bzdell (50) reported improvement in communication between physiotherapists and other health professionals in remote areas. Furthermore, Isojoki et al. (44) demonstrated that the priority ratings made by experts in adolescent psychiatry were correlated with the type and duration of the treatment received. This suggests that the PPT identified adolescents with the greatest need of psychiatric care and that it might, to some extent, predict the intensity of the treatment to be delivered (44). The system proposed by Valente et al. (64) enabled easy and coherent scheduling and reduced postponements.
Other outcomes related to the use of PPTs
Although some attempts have been made to assess the positive impacts of PPTs on patients, the results reported are not consistent enough to confirm such benefits. Seddon et al. (58) stated that priority scores for cardiac surgery prioritize patients as accurately as clinician assessments do according to the patients’ risk of cardiac events (cardiac death, myocardial infarction, and cardiac readmission). In their study concerning a PPT for patients waiting for hip and knee arthroplasty, Kingston and Masterson (46) used two instruments to measure patients’ priority scores (the Harris Hip Score and the American Knee Society Score) and found that the mean joint score for patients on the waiting list remained unchanged a year after the introduction of the new system.
Benefits and limitations of PPTs
The challenge of producing long-term assessments of PPT benefits has led researchers to rely on qualitative methods, i.e. stakeholder perceptions concerning the acceptability and benefits of PPTs. Focus groups including the general public indicated that the tools presented in five different clinical areas[1] were appropriate and acceptable (48). Other studies (23, 36, 49, 50, 67) examined the perceptions of PPT stakeholders (clinicians, managers, and surgeons) about the tools. Clinicians using a PPT in clinical practice stated that it promotes a shared and more homogeneous vision of patients’ needs, and that it helps to gather relevant information about them (23). It also improved transparency and equity for patients, as well as the accuracy of waiting times (36). In another study, physiotherapists reported increased job satisfaction, decreased job stress, and less time spent triaging referrals (50). They also commented that, compared to the methods used before the tool was introduced, the PPT allowed physiotherapy services to be delivered more equitably (50). In Rahimi et al.’s study (67), surgeons reported that the PPT provides a precise and reliable prioritization that is more effective than the prioritization method currently in use.
On a less positive note, some authors reported that PPTs were perceived as lacking flexibility, which limited their acceptance by surgeons (36, 49). In a study surveying surgeons about the use of PPTs, only 19.5% agreed that current PPTs were an effective method of prioritizing patients, and 44.8% felt that further development of surgical scoring tools had the potential to provide an effective way of prioritizing patients (49). In fact, most surgeons felt that their clinical judgment was the most effective way of prioritizing patients (49). Many studies mentioned the need to support implementation of PPTs in clinical practice and to involve potential tool users in the implementation process (21, 31, 48, 57, 60). In this vein, another recommendation concerning implementation was to secure agreement and to assess acceptability of the criteria and the tool in clinical settings (60). A panel of experts recommended that a set of operational definitions and instructions be prepared to accompany the criteria in order to make the tool more reliable (57). Implementation should also involve continuous monitoring and an evaluation of the effects of implementation on patient outcomes, on resource use, and on the patient-provider relationship (31).
[1] Hip and knee joint replacement; cataract removal surgery; general surgery; children’s mental health services; and MRI scanning.