Selection of syndromes for the ED-SyS: An overview of the Yukon ED-SyS is presented in Fig. 1. In total, seven syndromes of interest were selected for daily monitoring; these included: gastrointestinal illness (GI), influenza-like-illness (ILI), mumps, neurological infections (Neuro), rash, respiratory illnesses (Resp), and COVID-19. For each syndrome, diseases/conditions of interest were identified, and case definitions were developed based on algorithms that query information from the WGH ED-Tracker database (see Table 1). A complete list of terms used to query each syndrome is available in supplemental material (Supplement A1).
ED-SyS validation and performance: The merged dataset of ED visit data from the ED-Tracker and NACRS contained a total of 19,023 unique ED visits between October 1, 2018 and April 30, 2019. After applying the initial case definitions to this dataset, GI and Resp were the only two syndromes that greatly exceeded our target case sample of approximately 500 hits for validation. For these two syndromes, a subset of ED visits from November 1, 2018 to January 31, 2019 was used for validation (n = 8,246 records). In total, our original case definitions flagged 3,707 ED visits as potential cases for all syndromes (1,793 for GI; 966 for Resp; 593 for ILI; 64 for Rash; 234 for Neuro; 57 for Mumps; and 0 for COVID-19). As no records were flagged for COVID-19, the syndrome was not included in validation procedures.
Results from our validation are presented in Table 2. Among our initial case definitions, DD consistently returned the highest proportion of true positive cases (PPV: 51.3–100%), compared to CN (PPV: 22.8–86.1%) and CC (PPV: 0–35.0%) when used individually. In general, the CN field produced the most sensitive results, flagging the highest number of visits for all syndromes, while CC and DD fields provided a lower number of hits with higher specificity. These trends were maintained after adjustment to the initial case definitions, with the largest improvements observed in the GI syndrome (e.g., PPV for DD improved from 51.3% to 86.9%).
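As a reminder of how the PPV values above are derived, the following minimal sketch computes PPV as the share of flagged visits that were validated as true positives. The function name and the counts in the usage lines are illustrative assumptions, not values taken from Table 2.

```python
def ppv(true_positives, flagged):
    """Positive predictive value as a percentage of flagged visits.

    Returns None when nothing was flagged, since PPV is undefined in that
    case (as for a syndrome with zero hits in the validation dataset).
    """
    if flagged == 0:
        return None
    return 100.0 * true_positives / flagged


# Hypothetical counts, for illustration only.
example_ppv = ppv(40, 50)      # 40 validated true positives among 50 hits
undefined_ppv = ppv(0, 0)      # no hits, so no PPV can be reported
```

Returning None rather than 0 for a syndrome with no hits keeps "no evidence" distinct from "all hits were false positives".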
Combining multiple levels of data from ED-Tracker produced an average of 6.7-fold more hits for each syndrome than querying each component individually (Table 2). The largest example of this was observed when querying for visits related to the Resp Syndrome; using CC alongside CN increased hits from our original case definitions over 33-fold. Using our final case definitions for the combined fields produced the largest improvements to the GI Syndrome, with the PPV for the CN/CC combination increasing to 44.5% from 22.3% and the CN/CC/DD combination increasing to 78.8% from 48.8%. Changes observed among the other syndromes were minimal, save for the PPV of detecting the Mumps Syndrome, which increased from 50.9% to 94.1%, although the total number of observed hits decreased from 57 to 17.
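The effect of combining fields can be illustrated with a small sketch. It assumes that a visit counts as a hit when any of the selected fields matches (a logical OR); the text does not state the combination rule explicitly, although the increase in hits is consistent with it. The records, the field contents, and the single-term matcher are all hypothetical.

```python
def is_flagged(visit, fields, matches):
    """True if any of the selected fields satisfies the matcher (assumed OR rule)."""
    return any(matches(visit.get(field, "")) for field in fields)


def count_hits(visits, fields, matches):
    """Number of visits flagged when querying the given fields together."""
    return sum(is_flagged(visit, fields, matches) for visit in visits)


def has_cough(text):
    # Hypothetical one-term matcher standing in for a full case definition.
    return "cough" in text.lower()


# Hypothetical visits; real ED-Tracker records are not reproduced here.
visits = [
    {"CN": "productive cough x3 days", "CC": "cough", "DD": "bronchitis"},
    {"CN": "cough and wheeze", "CC": "shortness of breath", "DD": "asthma"},
    {"CN": "ankle injury", "CC": "lower extremity injury", "DD": "sprain"},
]

cc_only_hits = count_hits(visits, ["CC"], has_cough)
cn_cc_hits = count_hits(visits, ["CN", "CC"], has_cough)
```

Under an OR rule, adding a field can only keep or increase the number of hits, which is why the gain in sensitivity has to be weighed against the PPV of the combined definition.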
Evaluating terms and logic used to identify syndromes: After an initial review of the ED-SyS performance using our validation dataset, several adjustments were made to the terms and algorithms used in our ED-SyS to improve the system’s performance within the local context. In general, three areas were useful in redefining ED-SyS queries to provide more accurate results in the Yukon setting:
(1) Misspellings or shorthand among the free-text fields that were not considered during the development of the initial algorithms. For example, the acronym “LWBS” in the DD field was often used to indicate a patient had “left without being seen”, and additional acronyms were needed to describe the diarrhea, nausea, and vomit concepts within the GI syndrome case definition including “N&V” (nausea and vomiting), “N, V, D” (nausea, vomiting, diarrhea), and “V/D” (vomiting/diarrhea).
(2) Terms and CEDIS codes that could be used to further refine the inclusion of true positives or the exclusion of false positives. For example, additional negating terms were identified for several syndromes; “no”, “denies” and “–” were therefore added to indicate negation within the diarrhea, nausea, vomit, fever, and cough concepts. Several CEDIS codes were consistently present among true- and false-positive results for each syndrome; we used these codes to help provide additional specificity to our inclusion and exclusion criteria based on local context (Table 3). For example, a high proportion of visits flagged by our GI case definition presented with abdominal pain (CEDIS 251), but were validated as containing unclear/conflicting information (n = 91/571, classification = “1”) or as unrelated (n = 365/571, classification = “2”). For this reason, we elected to remove CEDIS 251 from the final GI syndrome case definition, along with several related terms that were present in the original syndromic case definition at the CN and DD level (“ascites”, “LUQ”, “LLQ”, “RLQ”, “RUQ”, “diverticulitis”, “appendicitis”, “abdominal bloating”, and “flatus”) and were also associated with false-positive records.
(3) Adjustments to the algorithm logic that could improve the application of our algorithms. For example, the original algorithm for detecting the Neuro syndrome queried the CN and CC fields only if the information from the DD field was blank. We modified this process to query CN and CC when the DD field contained at least one of the terms “febrile seizure,” “headache nyd,” “malaise nyd”, or “sepsis” without mention of the term “mening”. This improved both the sensitivity and specificity of the algorithm, increasing the number of hits from six to 19, while concomitantly increasing PPV from 83.3% to 89.5%.
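The three kinds of adjustment described above can be sketched in a few lines. This is a simplified illustration rather than the ED-SyS implementation: the shorthand, negating terms, and Neuro trigger terms are taken from the text, while the function names, the word stems, and the rule that a negator applies only to the term immediately following it are assumptions. Whether a blank DD field still triggers the CN/CC query after the modification is not stated, so the sketch implements only the stated condition.

```python
import re

# Shorthand from the text, mapped to the concepts it stands for.
SHORTHAND = {
    "n&v": {"nausea", "vomit"},
    "n, v, d": {"nausea", "vomit", "diarrhea"},
    "v/d": {"vomit", "diarrhea"},
}
# Word stems per concept (assumed; the full term lists are in Supplement A1).
STEMS = {"nausea": "nausea", "vomit": "vomit", "diarrhea": "diarrh",
         "fever": "fever", "cough": "cough"}
# A term is treated as negated when "no", "denies" or an en dash directly precedes it.
NEGATION = re.compile(r"(?:\bno|\bdenies|–)\s*$")


def affirmed_concepts(text):
    """Return the concepts mentioned in free text without a preceding negator."""
    text = text.lower()
    found = set()
    for short, concepts in SHORTHAND.items():
        for match in re.finditer(re.escape(short), text):
            if not NEGATION.search(text[:match.start()]):
                found |= concepts
    for concept, stem in STEMS.items():
        for match in re.finditer(r"\b" + stem + r"\w*", text):
            if not NEGATION.search(text[:match.start()]):
                found.add(concept)
    return found


NEURO_DD_TRIGGERS = ("febrile seizure", "headache nyd", "malaise nyd", "sepsis")


def query_cn_cc_original(dd):
    """Original Neuro rule: look at CN and CC only when DD is blank."""
    return dd.strip() == ""


def query_cn_cc_modified(dd):
    """Modified Neuro rule: DD holds a trigger term and no mention of 'mening'."""
    dd = dd.lower()
    return any(term in dd for term in NEURO_DD_TRIGGERS) and "mening" not in dd
```

For instance, `affirmed_concepts("N&V x2 days, denies fever")` yields the nausea and vomit concepts but not fever. A limitation of this simple negation rule is scope: in a phrase such as "no nausea or vomiting", only the first term would be treated as negated.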
After completing the above adjustments to improve case-finding, we visualized the most influential terms among the true-positive case records for each syndrome (Fig. 2). Each syndrome appeared to have several terms that were essential to identifying the syndrome of interest. In three syndromes (Rash, Resp, and ILI) CEDIS codes appeared among the most frequent terms.
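One simple way to rank terms of this kind is to count how many true-positive records contain each query term. The sketch below assumes frequency as the measure of influence, which may differ from the method used to produce Fig. 2; the records and terms are hypothetical.

```python
from collections import Counter


def term_frequencies(records, terms):
    """Count, for each term, the number of records in which it appears."""
    counts = Counter()
    for text in records:
        lowered = text.lower()
        for term in terms:
            if term in lowered:
                counts[term] += 1
    return counts


# Hypothetical true-positive records for a single syndrome.
records = ["Cough and fever", "cough x3 days", "sore throat"]
counts = term_frequencies(records, ["cough", "fever", "rash"])
ranked = counts.most_common()  # terms ordered from most to least frequent
```

Counting records rather than raw occurrences prevents a single long note that repeats a term from dominating the ranking.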