Delirium Screening in the Emergency Department: A Systematic Review and Meta-analysis

Background: Delirium is a complex syndrome characterized by a disturbance in attention and awareness, with a prevalence of 10-20% in patients admitted to the Emergency Department (ED). Screening tools have been developed to identify delirium in the ED, but their accuracy remains unclear. To address this, we conducted a systematic review and meta-analysis of the accuracy of delirium screening tools currently used to assess ED patients. Methods: PubMed, PsycINFO, EMBASE, and the Cochrane Library were searched. Studies involving ED patients that compared screening tools with the Diagnostic and Statistical Manual of Mental Disorders (DSM) criteria as the reference standard were included. Two reviewers independently screened the studies, extracted data, and assessed study quality using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS)-2 scale. We conducted a conventional meta-analysis for each screening tool. We then used network meta-analysis to calculate the relative sensitivity and specificity of the diagnostic tests. The diagnostic accuracies were then ranked using the superiority index. Results: Thirteen studies covering six screening tools were included. The pooled sensitivity and specificity were 0.71 and 0.98 for the Confusion Assessment Method (CAM) and 0.83 and 0.93 for the 4AT (Arousal, Attention, Abbreviated Mental Test 4, Acute change), respectively. The other four tools were each reported in only one or two studies. Their sensitivity ranged from 0.70 to 1.00, and their specificity ranged from 0.64 to 0.99. Moreover, the network meta-analysis indicated that the CAM and 4AT had a greater superiority index and a higher diagnostic accuracy. Conclusions: The available data suggest that both the CAM and 4AT can be used as efficient screening tools for ED patients.


Introduction
Delirium is a neurocognitive disorder characterized by acute onset, disturbed consciousness, and a fluctuating course. In the emergency department (ED) [1], the prevalence is reported to be as high as 10-20%, and 8-25% of older patients in the ED present with delirium [2]. Studies have confirmed that patients with delirium tend to have poor outcomes, including increased length of hospital stay, medical complications, increased risk of falls, and higher mortality [3] [4].
Although delirium is prevalent and associated with adverse outcomes, it still goes undetected by bedside nurses and medical staff in three out of four patients [5] [6]. In the ED especially, delirium is rarely assessed because of the high volume of patients and the tight time demands on providers [7]. In the United States, emergency physicians miss about 75% of delirium cases each year [8]. Delirium screening therefore remains a challenge for ED staff. As the center of modern healthcare, the ED should provide appropriate and rapid treatment at the earliest opportunity [9], which points to a need for screening tools. Clinical practice guidelines state that a valid delirium assessment tool is a crucial component of delirium detection [10]. An accurate screening tool could identify high-risk patients so that delirium can be reduced or prevented and its burden lowered [11].
Currently, several screening tools have been designed to support the assessment of delirium, but no single tool has been uniformly accepted for screening in the ED [12]. Different screening tools vary in sensitivity and specificity [13]. The time needed to complete the assessments also adds to the complexity of delirium detection [14]. Guidelines also differ in their recommendations. The Scottish Intercollegiate Guidelines Network (SIGN) [15] recommends that the 4AT (Arousal, Attention, Abbreviated Mental Test 4, Acute change) tool be used for identifying delirium in the ED. The National Institute for Health and Care Excellence (NICE) [16] suggests that the short Confusion Assessment Method (short CAM) be routinely used to diagnose delirium. Consequently, it remains uncertain which screening tool to use in the ED.
Several systematic reviews have summarized the findings on delirium screening tools in the ED, but they did not indicate which screening tool is better. Ewan et al. [17] summarized the results for delirium assessment and concluded that there is variability in screening methodology, in the procedures used to obtain consent, and in methodological quality; they also concluded that a validated screening method is urgently needed to identify delirium early. Lamantia et al. [12] concluded that there is still a lack of validated delirium screening tools in the ED.
José et al. [18] conducted a systematic search and found that the Confusion Assessment Method for the Intensive Care Unit (CAM-ICU) is the most widely used instrument, but not the most suitable for the ED. Although these reviews all gave a comprehensive description of the screening tools for delirium in the ED, some focused on the effectiveness of screening tools rather than their accuracy, and none conducted a quantitative meta-analysis to compare the different screening tools. With the emergence of new evidence, a systematic evaluation and meta-analysis is necessary. In addition, a pairwise meta-analysis cannot provide a complete picture of the screening tools or assess the relative diagnostic accuracy of different tools. Network meta-analysis has been used widely in interventional studies; it can compare the relative effects of three or more interventions at the same time and rank the interventions to select the optimal treatment plan, even in the absence of direct comparisons [19].
This study aimed to evaluate the accuracy of different screening tools for ED patients by using a network meta-analysis method, and to rank different methods of assessment using the superiority index (SI).

Methods
We conducted a systematic review and network meta-analysis in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement for diagnostic test accuracy studies [20]. The study protocol was registered in the PROSPERO registry (CRD42020153618).

Search strategy
PubMed, PsycINFO, EMBASE, and the Cochrane Library were searched from inception to December 2019. The search strategies were developed by QZ and guided by LG, who is an experienced evidence-based medicine researcher. The search terms were "delirium", "acute confusion", "diagnosis", "sensitivity" and "specificity". Complete details of the search strategies can be found in Supplement Table 1.
The references of relevant systematic reviews and meta-analyses were also searched to identify potential studies.

Study selection
We included studies that met the following criteria: (1) population limited to ED patients; (2) index tests that included at least one delirium assessment tool (e.g., 4AT, CAM), compared with the reference standard (Diagnostic and Statistical Manual of Mental Disorders IV or V (DSM-IV or DSM-V)); (3) sufficient information to calculate the values needed for diagnostic analysis [22], including true positive (TP), false positive (FP), true negative (TN), and false negative (FN) values; and (4) cohort or cross-sectional designs. Only studies published in English were included. We excluded editorials, commentaries, pilot studies, case reports, and duplicate studies.
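To illustrate why criterion (3) matters, the sketch below shows how sensitivity and specificity follow from the four cells of a 2x2 table. The function name and the counts are hypothetical and are not taken from any included study.

```python
def sensitivity_specificity(tp, fp, tn, fn):
    """Return (sensitivity, specificity) from the cells of a 2x2 table."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one delirious and one non-delirious patient")
    sensitivity = tp / (tp + fn)  # share of delirious patients the tool detects
    specificity = tn / (tn + fp)  # share of non-delirious patients the tool rules out
    return sensitivity, specificity


# Hypothetical counts, for illustration only
sens, spec = sensitivity_specificity(tp=40, fp=10, tn=190, fn=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A study that reports only one of the two measures, or omits the group sizes, cannot be reduced to these four counts and therefore cannot enter the pooled analysis.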
EndNote X9 was used to manage the initial search records; after duplicate records were removed, the remaining records were imported into Rayyan [23], a free web and mobile application for systematic reviews. Two reviewers (ZQG and QZ) independently screened the titles and abstracts of all identified records. We then downloaded the full texts of potentially eligible records to review them further for inclusion. Disagreements were resolved by discussion or through consultation with a third reviewer (LG). If more than one assessor (either physicians or nurses) used the screening tools in a study, we included the results with the highest sensitivity in our meta-analysis. Data extraction was performed independently by two reviewers (QZ and MXC). Conflicts were resolved by consensus or consultation with a third reviewer (LG).

Quality evaluation
Two reviewers (ZQG and QZ) independently assessed the risk of bias for each study as low, moderate, or high using criteria adapted from the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool [24]. The tool comprises four domains: patient selection, index test, reference standard, and flow and timing. Answers on risk of bias and applicability were recorded as "no", "yes", or "unclear". We conducted the quality evaluation separately for each test. For example, a study that assessed both the CAM and the 4AT on the same patients would have two QUADAS-2 assessments. Conflicts were resolved by discussion, and any that remained unresolved were settled by consulting a third reviewer (LG).

Pairwise meta-analysis
All statistical analyses were performed with STATA version 15.1 (Stata Corporation, College Station, TX, USA) using the "midas" program. We calculated the pooled accuracy estimates (sensitivity, specificity, positive and negative likelihood ratios (LRs), and diagnostic odds ratio (DOR)) across studies with 95% confidence intervals (CIs) using the bivariate mixed-effects regression model. The summary receiver operating characteristic (SROC) curve was plotted and the pooled area under the curve (AUC) value was calculated [25].
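For readers less familiar with these measures, the following is a minimal sketch of how the likelihood ratios and the DOR relate to a single pair of sensitivity and specificity values. It is not the bivariate mixed-effects model, which estimates the pooled quantities jointly across studies; the function name is an assumption, and the inputs are the pooled point estimates reported for the CAM, used here purely as example numbers.

```python
def likelihood_ratios_and_dor(sensitivity, specificity):
    """Return (LR+, LR-, DOR) for one sensitivity/specificity pair."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1 - specificity)   # how much a positive result raises the odds
    lr_neg = (1 - sensitivity) / specificity   # how much a negative result lowers the odds
    dor = lr_pos / lr_neg                      # diagnostic odds ratio
    return lr_pos, lr_neg, dor


# Point estimates only; pooled LRs and DOR from the bivariate model may differ.
lr_pos, lr_neg, dor = likelihood_ratios_and_dor(sensitivity=0.71, specificity=0.98)
print(f"LR+={lr_pos:.1f}, LR-={lr_neg:.2f}, DOR={dor:.0f}")
```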
Univariable meta-regression and subgroup analysis were planned to further explore the potential sources of heterogeneity [26]. A P-value < 0.05 for sensitivity or specificity was used to determine whether there was a statistically significant difference in sensitivity, specificity, or both among the levels of a particular covariate.
Deeks' funnel plot asymmetry test was used to assess publication bias. An unequal distribution in the visual funnel plot or a P-value of < 0.05 was considered to indicate statistically significant bias.
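The regression behind Deeks' test can be sketched as follows: the log DOR of each study is regressed on the inverse square root of its effective sample size (ESS), weighted by ESS, and the slope is tested against zero. This is an illustrative pure-Python version, not the Stata routine used in the analysis. The 2x2 counts are hypothetical, and adding 0.5 to every cell is an assumption of this sketch.

```python
import math


def ln_dor(tp, fp, fn, tn):
    """Log diagnostic odds ratio, adding 0.5 to every cell (assumed correction)."""
    tp, fp, fn, tn = (cell + 0.5 for cell in (tp, fp, fn, tn))
    return math.log((tp * tn) / (fp * fn))


def effective_sample_size(n_diseased, n_non_diseased):
    """ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)


def weighted_linear_fit(x, y, w):
    """Weighted least squares of y on x; returns (intercept, slope, slope_se)."""
    n = len(x)
    if n < 3:
        raise ValueError("at least three studies are needed")
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("all studies have the same effective sample size")
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    slope_se = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, slope_se


def deeks_regression(studies):
    """studies: iterable of (tp, fp, fn, tn) tuples, one per study."""
    x, y, w = [], [], []
    for tp, fp, fn, tn in studies:
        ess = effective_sample_size(tp + fn, fp + tn)
        x.append(1 / math.sqrt(ess))
        y.append(ln_dor(tp, fp, fn, tn))
        w.append(ess)
    return weighted_linear_fit(x, y, w)


# Hypothetical 2x2 counts, for illustration only
studies = [(40, 10, 10, 190), (15, 5, 8, 72), (60, 20, 12, 308), (9, 2, 6, 33)]
intercept, slope, slope_se = deeks_regression(studies)
t_statistic = slope / slope_se  # compare with a t distribution on n - 2 degrees of freedom
print(f"slope={slope:.3f}, se={slope_se:.3f}, t={t_statistic:.3f}")
```

The P-value itself would come from the t distribution with n - 2 degrees of freedom, which is left to a statistics package.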

Geometry of the network
We drew a network plot to show the geometry of the evidence. In the network plot, the size of each node is proportional to the number of studies evaluating that test, and the thickness of the line between two nodes is proportional to the number of direct comparisons between the two tests. Network connectivity was assessed by checking that each test was compared with at least one of the remaining tests [27].
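A minimal sketch of how the quantities behind such a network plot could be tabulated, assuming that node size reflects the number of studies evaluating a test and that line thickness reflects the number of studies comparing two tests directly. The study labels and their test sets are hypothetical, and the connectivity check shown is a generic graph traversal rather than the procedure of reference [27].

```python
from collections import Counter
from itertools import combinations


def network_geometry(studies):
    """studies: mapping of study label -> set of tests evaluated in that study.

    Returns (node_sizes, edge_weights) as Counters.
    """
    node_sizes = Counter()
    edge_weights = Counter()
    for tests in studies.values():
        node_sizes.update(tests)
        # every pair of tests evaluated in the same study is a direct comparison
        for pair in combinations(sorted(tests), 2):
            edge_weights[pair] += 1
    return node_sizes, edge_weights


def is_connected(node_sizes, edge_weights):
    """True if every test is linked to every other through direct comparisons."""
    nodes = set(node_sizes)
    if not nodes:
        return True
    neighbours = {node: set() for node in nodes}
    for a, b in edge_weights:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(neighbours[node] - seen)
    return seen == nodes


# Hypothetical studies, for illustration only
example = {"study_A": {"CAM", "4AT"}, "study_B": {"CAM"}, "study_C": {"4AT", "RASS"}}
node_sizes, edge_weights = network_geometry(example)
print(dict(node_sizes), dict(edge_weights), is_connected(node_sizes, edge_weights))
```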

Indirect comparison between competing diagnostic tests
We used the analysis of variance model in the R software V.3.4.1 (R Core Team, Vienna, Austria) to calculate the relative diagnostic outcomes between index tests, such as relative sensitivity, relative specificity, relative diagnostic odds ratio and superiority index.
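As a rough illustration of the ranking step, the sketch below computes a superiority index from point estimates, assuming the commonly cited definition S = (2a + c) / (2b + c), where a, b and c are the numbers of competing tests to which the target test is superior, inferior and equal. Several choices here are assumptions of this sketch and not confirmed details of the analysis: a test is treated as superior only when both its sensitivity and its specificity are higher, mixed comparisons are not counted, and an undefined ratio is returned as NaN. The tool names and accuracy values are hypothetical.

```python
import math


def superiority_index(target, accuracies):
    """accuracies: mapping of test name -> (sensitivity, specificity)."""
    sens_t, spec_t = accuracies[target]
    superior = inferior = equal = 0
    for name, (sens, spec) in accuracies.items():
        if name == target:
            continue
        if sens_t > sens and spec_t > spec:
            superior += 1
        elif sens_t < sens and spec_t < spec:
            inferior += 1
        elif sens_t == sens and spec_t == spec:
            equal += 1
        # mixed results (better on one measure, worse on the other) are not counted
    denominator = 2 * inferior + equal
    if denominator == 0:
        # never inferior or equal: unbounded if superior at least once, else undefined
        return math.inf if superior > 0 else math.nan
    return (2 * superior + equal) / denominator


# Hypothetical tools and values, for illustration only
tools = {"tool_A": (0.90, 0.95), "tool_B": (0.80, 0.90), "tool_C": (0.70, 0.85)}
for name in tools:
    print(name, superiority_index(name, tools))
```

In a model-based network meta-analysis the index would typically be evaluated over the estimated distribution of sensitivities and specificities rather than on single point estimates, so the values from this sketch are only indicative.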

Study selection
After duplicates were removed, the search yielded 4904 records, of which 4765 were excluded after title and abstract screening. Of the 139 articles assessed that restricted the medical setting to the ED, we excluded 100 records. Finally, 13 studies with 3023 participants were included in the analysis (Fig. 1); the reasons for exclusion are presented in Supplement Table 2.

Study characteristics and quality
We identified six different tools to screen for delirium. Figure 2

Results of meta-analysis
Some assessment tools were evaluated in only one or two studies, so only studies that evaluated the CAM or the 4AT were included in the conventional meta-analysis. Table 2 presents the results of the meta-analysis.

Results of the network meta-analysis
The results of the network meta-analysis are presented in Table 3 and Supplement Fig. 10. The ranking probabilities showed that the CAM and 4AT shared the highest sensitivity, followed by RASS, and that the CAM had the highest specificity, followed by the 4AT. The CAM also had the highest superiority index, suggesting higher diagnostic accuracy, followed by the 4AT and RASS. The ranking by superiority index matches the top three assessment tools ranked by sensitivity and specificity.

Discussion
This systematic review and network meta-analysis on the diagnostic accuracy of delirium screening tools is the first attempt to summarize the evidence of recent studies. Our main finding is that in the ED, the 4AT showed higher sensitivity (0.83), whereas the CAM showed higher specificity (0.98) and higher diagnostic accuracy, indicating good overall diagnostic performance.
It is clear from this study that delirium is common among ED patients, with a prevalence of 7-23%. All included participants were over 65 years old. Age is a risk factor for delirium, which makes it more common among older adults. Because of the nature of the ED, most studies did not restrict the types of disease. Different assessors could produce different sensitivity and specificity. When choosing an instrument, the training required before use should also be considered. In most studies, the assessments were conducted by trained nurses or geriatricians. Evelien et al. [29] compared the 4AT with the CAM; assessors received special training on the CAM but not on the 4AT, because specific training for the 4AT is not required. These factors may affect the sensitivity and specificity of the screening tools.
Although the CAM has good accuracy in routine delirium screening, it may not be ideally suited to the ED because of the time it takes. The CAM assessment is based on four cardinal features: 1) an acute onset of mental status changes with a fluctuating course; 2) inattention; 3) disorganized thinking; 4) an altered level of consciousness. The patient is diagnosed as delirious if both features 1 and 2 are present, together with either feature 3 or 4 [30]. Before delirium can be assessed with the CAM, bedside interviews and short cognitive tests are needed, which usually take 5-10 minutes. Wolfgang et al. [31] developed the modified Confusion Assessment Method for the Emergency Department (mCAM-ED), which takes about 3.2 min. By contrast, the 4AT is a newly developed screening tool for rapid initial assessment of delirium (www.the4AT.com). Its brevity (generally < 2 min) makes it more suitable for delirium screening, particularly in the ED, where the time available to perform a diagnosis is limited.
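The CAM decision rule described above can be written as a single boolean expression. The sketch below encodes only that scoring rule, not the bedside interview or cognitive testing needed to rate each feature; the function and parameter names are illustrative.

```python
def cam_positive(acute_onset_fluctuating, inattention,
                 disorganized_thinking, altered_consciousness):
    """Apply the CAM rule: features 1 and 2, plus either feature 3 or 4."""
    return bool(
        acute_onset_fluctuating
        and inattention
        and (disorganized_thinking or altered_consciousness)
    )


# Features 1, 2 and 4 present: screens positive
print(cam_positive(True, True, False, True))
# Feature 2 (inattention) absent: screens negative regardless of features 3 and 4
print(cam_positive(True, False, True, True))
```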
In the network meta-analysis, the CAM had higher specificity than the 4AT, which is consistent with the results of our pairwise meta-analysis. It is worth noting that all screening tools identified in the included studies were included in the network meta-analysis to validate our findings.

Comparison with other studies
Michael [12] and Pilar [32] conducted systematic reviews describing delirium screening tools, but they provided limited evidence for comparing the accuracy of the individual tools. Unlike these previous reviews, we included more recently published studies and performed both a pairwise meta-analysis and a network meta-analysis to draw our conclusions. We analyzed the included studies in detail and provide evidence to help ED staff select screening tools.

Strengths and Limitations
This review provides a comprehensive outline of the use of delirium screening instruments in the ED. Our research has the following strengths: (1) we used a comprehensive search with explicit inclusion and exclusion criteria; (2) study selection and data extraction were conducted independently; (3) we assessed the risk of bias of the individual studies, which increased the validity of our conclusions; and (4) we used an advanced meta-analysis method, indirect comparison, to compare different screening tools and rank their relative superiority. However, our research has some limitations. Only studies published in English were included. In addition, four screening tools in our network meta-analysis were evaluated in only one or two studies, which may lead to unstable estimates.

Conclusion
The present meta-analysis suggests that both the CAM and 4AT show better diagnostic accuracy in detecting delirium in ED patients; however, the 4AT takes less time to administer than the CAM. More high-quality studies are needed to assess the accuracy of other delirium screening tools.

Ethical approval
For this type of study formal consent is not required.

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.