Development and Validation of an Automated Emergency Department-Based Syndromic Surveillance System for Use at a Mass Gathering Event (2020 Arctic Winter Games, Yukon Canada) During a Global Pandemic: Implications for Lower Resourced and Remote Settings

Background: Automated syndromic surveillance systems are useful tools for rapidly identifying health risks during times when routine surveillance and follow-up cannot meet the demands of the population. In Yukon, Canada, the Arctic Winter Games were scheduled for March 2020 and were expected to increase the local population beyond the capacity of local public health surveillance. An emergency department-based automated syndromic surveillance system was therefore developed and validated using local hospitalization records for use during the event. Methods: Syndromes of interest were identified in consultation with the local public health authorities. For each syndrome, case definitions were developed using published resources and expert elicitation. Natural language processing algorithms were then written to detect syndromic cases from three different fields (triage notes; chief complaint; discharge diagnosis) using Yukon emergency department case data containing information from 19,082 visits over the period of October 1, 2018 to April 30, 2019. The automatic identification of cases was then manually validated by two raters, and the results were used to calculate positive predictive values for each syndrome and to identify improvements to the detection algorithms. Results: A total of six syndromes were originally identified for the syndromic surveillance system (Gastrointestinal, Influenza-like-Illness, Mumps, Neurological Infections, Rash, Respiratory), with an additional syndrome added to assist in detecting potential cases of COVID-19. The positive predictive value for the automated detection of each syndrome ranged from 48.8%-89.5% before, and from 62.5%-94.1% after, implementing improvements identified during validation. As expected, no records were flagged for COVID-19 from our validation dataset. However, the system was rapidly adapted into an additional surveillance tool for use in the COVID-19 pandemic.
Conclusions: Validation is an important step for measuring the accuracy of syndromic surveillance and ensuring it performs adequately in a local context. Ultimately, the 2020 Arctic Winter Games were cancelled due to the risks associated with mass gatherings during the global pandemic of COVID-19, and the system therefore could not be tested under a mass gathering scenario. However, the results from our validation study suggest that our surveillance system may be useful for future mass gathering events, and it proved a timely development for integration into Yukon’s COVID-19 surveillance. Our study highlights the feasibility of implementing an automated ED-SyS and validating syndrome case definitions in a low-resourced/remote setting using simple tools, resources, and adapted gold standard methods. Our approach to developing an ED-SyS allowed us to move away from “drop-in” paper-based methods to create a “low-tech” sustainable system that can be leveraged for other mass-gathering events, other emerging health issues of concern, and general ongoing surveillance. Our study also reinforces the importance and value of validating syndrome case definitions using local data. Importantly, our study provides a path forward for other lower-resourced rural/remote settings on how to develop and validate syndrome case definitions.


Introduction
The Arctic Winter Games (AWG) is an international sporting and cultural event that brings together athletes from countries in the Circumpolar North, including Canada; U.S.A.; Greenland; Norway; Sweden; Finland; and Russia. This mass gathering of athletes, coaches, and media was scheduled to occur in Whitehorse, Yukon in March 2020, and was anticipated to impose considerable pressure on the territory's public health infrastructure. In other mass-gathering sporting events, vaccine-preventable diseases such as measles, influenza, and meningococcal and gastrointestinal illnesses have been reported [1,2]. Moreover, the pandemic of novel coronavirus disease (COVID-19) and an ongoing outbreak of pertussis in northern Canada underscored the need for enhanced surveillance during the AWG, especially in a lower-resourced rural/remote setting such as Yukon, where meeting surge demand in the event of an outbreak represents a significant challenge. This paper describes a novel, near real-time automated emergency department (ED) syndromic surveillance system (SyS) that was developed and validated in preparation for the 2020 AWG and integrated into Yukon's COVID-19 surveillance infrastructure.
Emergency department-based SyS are recognized tools for enhanced surveillance that complement traditional laboratory-based surveillance methods [3-6]. These systems automate the use of existing (near-)real-time pre-diagnostic data that is routinely collected in hospitals and apply statistical algorithms to detect aberrations for immediate public health investigation. While it is recommended that investment in mass gathering surveillance should provide a system that is sustainable for long-term use [2], this historically has not been the case in Yukon.
Previous mass gatherings in the territory used daily "drop-in" paper-based data collection methods that, while effective at the time, were too resource- and labor-intensive to be sustainable long-term. Although automated SyS are often perceived to require complex technology or significant human resources, they do not need to be highly technical or costly to provide lasting benefit once established [7,8].
To properly support an ED-SyS, operational case definitions must be developed and validated. These influence the balance between identifying all possible cases (i.e., sensitivity) and excluding those without the disease of interest (i.e., specificity) [1,2]. There is no gold standard approach for developing or validating ED-SyS case definitions; methods described in the literature suggest developing definitions using expert-based consensus followed by an ongoing refinement process, with validation based on chart review or comparison with a gold-standard dataset [9-17]. Most case definitions rely on algorithms that identify keywords associated with a syndrome of interest in the chief complaint (CC) and discharge diagnosis (DD) fields [6,16,18-23], with the emerging use of clinical triage notes (CN) [16,24-26]. Several studies have noted disagreement between syndrome definitions when using CC fields versus DD fields but, in general, case detection improves when leveraging information from both fields [11,16,22,24,25,27,28]. The use of CN fields also increases the sensitivity of the definition but may decrease specificity [16,24-26].
There is currently a need to identify methods that lower-resourced communities can use to develop and operate automated ED-SyS that approach the efficacy of a fully validated real-time system, without the steep setup time and resource costs required by more sophisticated software. Moreover, these systems should be validated and optimized for use in the unique communities where they are applied. The objectives of the present study are therefore to describe the development and validation of Yukon's first automated ED-SyS and to evaluate the effectiveness of three different ED data fields. Taken together, these results may be used to inform the development of novel ED-SyS in communities similar to Yukon that lack the resources required for larger, more complex surveillance systems. While the AWG event was ultimately canceled in light of COVID-19 physical distancing precautions, the ED-based SyS described here may provide lasting benefit beyond its potential use during a single mass-gathering event, particularly in rural communities with resources similar to Yukon's.

Methods
The design and validation of Yukon's ED-SyS occurred between November 2019 and February 2020, involving the following stages: (i) initial review of available data sources; (ii) development of initial case definitions; (iii) development of natural language processing algorithms and logic; (iv) validation of the ED-SyS using a validation dataset; (v) refinements to algorithms and logic.
Data Source: Emergency department records from Whitehorse General Hospital (WGH), the only hospital in Whitehorse and Yukon's primary hospital, were used to perform this study. The Meditech ED-Tracker system at WGH is an electronic medical records database that captures demographic characteristics; date of visit; clinical notes (CN) (e.g., free text describing a brief history of the stated complaint, the recorded temperature (in °C) at triage); the chief complaint (CC) containing the Canadian Emergency Department Information System (CEDIS) code [29,30]; and the physician discharge diagnosis (DD), which is a free-text field providing the diagnosis at patient discharge.
ED-Tracker data from visits that occurred after January 1st, 2018 were provided for use. However, data before October 1st, 2018 were inconsistent (e.g., incomplete data and/or missing fields) and were therefore excluded from the study. National Ambulatory Care Reporting System (NACRS) records between January 1st, 2018 and April 30th, 2019 were also used as a reference in our validation process for the inclusion of International Classification of Diseases, 10th Revision (ICD-10) codes. All analyses were performed using Stata LP 15.1 (Texas, USA).
Development of initial case definitions: Syndromes of interest were identified via consensus between Yukon's Communicable Disease Control Program (YCDC) Manager, the office of the Chief Medical Officer of Health (OCMOH), and the Territorial Epidemiologist. A review of existing syndromic surveillance platforms (e.g., NC-Detect, ESSENCE-II) [31,32] along with literature searches were used to inform the terms used in the initial case definitions for each syndrome [6, 10, 12, 13, 15-19, 21, 22, 24, 33-40]; these were further refined for local context by review from stakeholders at YCDC and the OCMOH. In addition to key terms, CEDIS codes related to each syndrome were also identified and included in case definitions. Additionally, a syndrome related to COVID-19 was developed following recommendations published by the World Health Organization (WHO) [41] and local consensus with the OCMOH. This case-finding definition was further developed through extensive internal review, a review of WHO's COVID-19 weekly Situation Reports [42], and recommendations made available online by the Public Health Agency of Canada [43]. In brief, individual terms and codes related to each syndrome were organized into more broadly defined concepts that were then used to determine whether a patient record met the case definition for the syndrome of interest (see Supplement A1). Using these key terms and concepts we developed algorithms to query ED-Tracker records from October 1, 2018 to April 30, 2019, as this dataset coincided with available NACRS data. Algorithms were designed to query each data field using a forward-inclusion strategy, typically starting with DD, then querying the CC and CN fields (see Table 1).
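The forward-inclusion strategy described above can be illustrated with a minimal Python sketch. This is not the Stata code used in the study: the term list, the placeholder CEDIS code, the field names (`dd`, `cc`, `cn`), and the function names are all hypothetical, and the actual terms are those listed in Supplement A1. The sketch queries DD first and falls back to CC and then CN only when the earlier field does not match.

```python
import re

# Hypothetical concept terms for illustration only (see Supplement A1 for the real lists).
GI_TERMS = ["gastroenteritis", "vomit", "diarrhea", "nausea"]
# Placeholder code, NOT a real CEDIS code assignment.
GI_CEDIS = {"000"}


def has_term(text, terms):
    """Return True if any term starts a word in the free-text field."""
    text = (text or "").lower()
    return any(re.search(r"\b" + re.escape(term), text) for term in terms)


def flag_gi(record):
    """Return the first field that meets the case definition, or None.

    Fields are checked in forward-inclusion order: DD, then CC, then CN.
    """
    if has_term(record.get("dd"), GI_TERMS):
        return "DD"
    if record.get("cc") in GI_CEDIS:
        return "CC"
    if has_term(record.get("cn"), GI_TERMS):
        return "CN"
    return None
```

Returning the name of the matching field, rather than a simple yes/no, makes it possible to tabulate hits by field afterwards.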
Validation and refinement of initial syndromic surveillance case definitions: Records with an exact match between ED-Tracker and NACRS datasets (i.e., visits occurring between January 1st, 2018 and April 30th, 2019) were used to validate the ED-SyS. For syndromes that returned a large number of records (e.g., > 600) identified as positive, a subset of the data containing records from November 1, 2018 to January 31, 2019 was used instead. Following the example set by [25], records flagged for meeting syndromic case definitions were assigned one of three validation classifications: potentially related to the disease of interest (true-positive) = "0"; unclear/conflicting information (false positive) = "1"; does not match case definition/not related (false positive) = "2" (Table 1). These assignments were carried out first by reviewing available ICD-10 codes; any flagged case with an ICD-10 code equivalent to those outlined in [44] was assigned a zero ("0"). This step was accomplished using the US-CDC's database of equivalent code translation (available at: https://icd.codes/convert/icd9-to-icd10-cm). Remaining records were validated manually by two epidemiologists reviewing the ICD-10 code, DD, CC and CN fields, typically in that order. After an initial review, a random sample of 10 cases was taken from the "0", "1", and "2" sub-groups for all syndromes (30 cases total per syndrome) and reviewed by both epidemiologists to ensure agreement and consistency of classification.
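The inter-rater check described above, a random draw of 10 records from each classification for each syndrome, can be sketched as follows. The record layout, function name, and fixed seed are assumptions made for illustration and do not reflect the study's Stata implementation.

```python
import random


def sample_for_double_review(records, per_class=10, seed=0):
    """Draw up to `per_class` records from each validation classification.

    `records` is a list of dicts with a "classification" key of "0", "1", or "2"
    (hypothetical layout). A fixed seed keeps the draw reproducible.
    """
    rng = random.Random(seed)
    sample = []
    for cls in ("0", "1", "2"):
        group = [r for r in records if r["classification"] == cls]
        # Take the whole group if it holds fewer than `per_class` records.
        sample.extend(rng.sample(group, min(per_class, len(group))))
    return sample
```

With at least 10 records in every sub-group, this yields the 30 records per syndrome that both reviewers then classify independently.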
Following validation, we manually reviewed the validated records from each syndrome to identify: (1) misspellings or shorthand among the free-text fields that were not considered during the development of the initial algorithms; (2) terms and CEDIS codes that could be used to further refine the inclusion or exclusion of true- and false-positives, respectively; and (3) adjustments to the algorithm structure to improve case classification.
Evaluating the performance of the ED-SyS: Each syndrome included in the ED-SyS was evaluated by measuring the positive predictive value (PPV) using the validated case records. The PPV was defined as the total number of true-positive records (classification = "0") over the total number of records (classification = "0", "1", or "2"), as a percent. To evaluate the contribution of each data field to the overall performance of the ED-SyS, we measured the PPV from each component individually as well as in combination (e.g., CN; CC; DD; CN + CC; CN + CC + DD). Finally, to evaluate which terms were most influential in identifying each syndrome, the frequency of terms present among true-positive case records was measured and visually assessed using word clouds (NB: select terms may have been double-counted since some concept terms may exist in both CN and DD algorithms).
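As a worked illustration of the PPV definition above, the following Python sketch computes the percentage from a list of validation classifications. The function name and the example counts are hypothetical; the study's own calculations were performed in Stata.

```python
from collections import Counter


def ppv(classifications):
    """Positive predictive value, in percent.

    True positives are records classified "0"; the denominator is every
    validated record ("0", "1", or "2"). Returns None when nothing was flagged.
    """
    counts = Counter(classifications)
    total = counts["0"] + counts["1"] + counts["2"]
    if total == 0:
        return None
    return 100.0 * counts["0"] / total


# Illustrative counts: 17 true positives among 19 flagged records.
example = ["0"] * 17 + ["1"] + ["2"]
print(round(ppv(example), 1))  # 89.5
```

Returning None for an empty input reflects the situation where a syndrome flags no records at all, in which case a PPV cannot be computed.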

Results
Selection of syndromes for the ED-SyS: An overview of the Yukon ED-SyS is presented in Fig. 1. In total, seven syndromes of interest were selected for daily monitoring; these included: gastrointestinal illness (GI), influenza-like illness (ILI), mumps, neurological infections (Neuro), rash, respiratory illnesses (Resp), and COVID-19. For each syndrome, diseases/conditions of interest were identified, and case definitions were developed based on algorithms that query information from the WGH ED-Tracker database (see Table 1). A complete list of terms used to query each syndrome is available in supplemental material (Supplement A1).
ED-SyS validation and performance: The merged dataset of ED visit data from the ED-Tracker and NACRS contained a total of 19,023 unique ED visits between October 1, 2018 and April 30, 2019. After applying the initial case definitions to this dataset, GI and Resp were the only two syndromes that greatly exceeded our target case sample of approximately 500 hits for validation. For these two syndromes, a subset of ED visits from November 1, 2018 to January 31, 2019 was used for validation (n = 8,246 records). In total, our original case definitions flagged 3,707 ED visits as potential cases for all syndromes (1,793 for GI; 966 for Resp; 593 for ILI; 64 for Rash; 234 for Neuro; 57 for Mumps; and 0 for COVID-19). As no records were flagged for COVID-19, the syndrome was not included in validation procedures.
Results from our validation are presented in Table 2. Among our initial case definitions, DD consistently returned the highest proportion of true positive cases (PPV: 51.3-100%), compared to CN (PPV: 22.8-86.1%) and CC (PPV: 0-35.0%) when used individually. In general, the CN field produced the most sensitive results, flagging the highest number of visits for all syndromes, while CC and DD fields provided a lower number of hits with higher specificity. These trends were maintained after adjustment to the initial case definitions, with the largest improvements observed in the GI syndrome (e.g., PPV for DD improved from 51.3% to 86.9%).
Combining multiple levels of data from ED-Tracker produced an average of 6.7-fold more hits for each syndrome than querying each component individually (Table 2). The largest example of this was observed when querying for visits related to the Resp syndrome; using CC alongside CN increased hits from our original case definitions over 33-fold. Using our final case definitions for the combined fields produced the largest improvements to the GI syndrome, with the PPV for the CN/CC combination increasing to 44.5% from 22.3% and the CN/CC/DD combination increasing to 78.8% from 48.8%. Changes observed among the other syndromes were minimal, save for the PPV of detecting the Mumps syndrome, which increased from 50.9% to 94.1%, although the total number of observed hits decreased from 57 to 17.
Evaluating terms and logic used to identify syndromes: After an initial review of the ED-SyS performance using our validation dataset, several adjustments were made to the terms and algorithms used in our ED-SyS to improve the system's performance within the local context. In general, three areas were useful in redefining ED-SyS queries to provide more accurate results in the Yukon setting: (1) Misspellings or shorthand among the free-text fields that were not considered during the development of the initial algorithms. For example, the acronym "LWBS" in the DD field was often used to indicate a patient had "left without being seen", and additional acronyms were needed to describe the diarrhea, nausea, and vomit concepts within the GI syndrome case definition, including "N&V" (nausea and vomiting), "N, V, D" (nausea, vomiting, diarrhea), and "V/D" (vomiting/diarrhea).
(2) Terms and CEDIS codes that could be used to further refine the inclusion or exclusion of true- and false-positives, respectively. For example, additional negating terms were identified for several syndromes; "no", "denies", and "-" were therefore added to indicate negation within the diarrhea, nausea, vomit, fever, and cough concepts. Several CEDIS codes were consistently present among true- and false-positive results for each syndrome; we used these codes to help provide additional specificity to our inclusion and exclusion criteria based on local context (Table 3). For example, a high proportion of visits flagged by our GI case definition presented with abdominal pain (CEDIS 251), but were validated as unclear/conflicting information (n = 91/571, classification = "1") or unrelated (n = 365/571, classification = "2"). For this reason, we elected to remove CEDIS 251 from the final GI syndrome case definition, in addition to several related terms that were present in the original syndromic case definition at the CN and DD level ("ascites", "LUQ", "LLQ", "RLQ", "RUQ", "diverticulitis", "appendicitis", "abdominal bloating", and "flatus") and were also associated with false-positive records.
(3) Adjustments to the algorithm logic that could improve the application of our algorithms. For example, the original algorithm for detecting the Neuro syndrome queried the CN and CC fields only if the information from the DD field was blank. We modified this process to query CN and CC when the DD field contained at least one of the terms "febrile seizure," "headache nyd," "malaise nyd", or "sepsis" without mention of the term "mening". This improved both the sensitivity and specificity of the algorithm, increasing the number of hits from six to 19, while concomitantly increasing PPV from 83.3% to 89.5%.
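The three kinds of refinement described above (shorthand, negation, and conditional logic) can be sketched in Python as follows. This is an illustrative sketch rather than the study's Stata code. All function names are hypothetical, the negation rule only handles a negator placed immediately before the term, and retaining the blank-DD condition in the revised Neuro rule is an assumption, since the text does not state whether it was kept.

```python
import re

# Shorthand reported for the GI syndrome, mapped to the wording it stands for.
SHORTHAND = {
    "n&v": "nausea and vomiting",
    "n, v, d": "nausea, vomiting, diarrhea",
    "v/d": "vomiting/diarrhea",
}

# DD terms that trigger a look at CN and CC in the revised Neuro logic.
NEURO_DD_TRIGGERS = ("febrile seizure", "headache nyd", "malaise nyd", "sepsis")


def expand_shorthand(text):
    """Lower-case the text and replace stand-alone shorthand with full terms."""
    text = text.lower()
    for short, full in SHORTHAND.items():
        text = re.sub(r"(?<!\w)" + re.escape(short) + r"(?!\w)", full, text)
    return text


def affirmed(text, term):
    """True if `term` appears at least once without a directly preceding
    negator ("no", "denies", or "-")."""
    text = text.lower()
    mentions = len(re.findall(re.escape(term), text))
    negated = len(
        re.findall(r"(?:\bno\s+|\bdenies\s+|-\s*)" + re.escape(term), text)
    )
    return mentions > negated


def neuro_should_query_cn_cc(dd, revised=True):
    """Decide whether the Neuro algorithm should go on to query CN and CC."""
    dd = (dd or "").strip().lower()
    if dd == "":
        return True  # original rule; ASSUMED to be retained in the revised logic
    if not revised:
        return False
    return any(t in dd for t in NEURO_DD_TRIGGERS) and "mening" not in dd
```

A production system would need a more careful treatment of negation scope (for example, "denies nausea or vomiting" negates both terms), which this sketch does not attempt.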
After completing the above adjustments to improve case-finding, we visualized the most influential terms among the true-positive case records for each syndrome (Fig. 2). Each syndrome appeared to have several terms that were essential to identifying the syndrome of interest. In three syndromes (Rash, Resp, and ILI), CEDIS codes appeared among the most frequent terms.

Discussion
Our ED-SyS is the first automated system to be implemented in Yukon and, to our knowledge, in northern Canada.
In preparation for the 2020 AWG, we developed an ED-SyS that included six syndromes, with a seventh syndrome created in response to an ongoing global pandemic. While the AWG was ultimately canceled, the preparations allowed us to later activate the ED-SyS for enhanced surveillance during the COVID-19 pandemic, thus highlighting the flexibility and utility of these types of systems in settings like Yukon where surveillance capacity may otherwise be limited.
Results from our validation indicated that, for each syndrome tested, the CN field generally provided the most sensitive results, whereas the DD field was the most specific. These results were not surprising to us, as the information stored in each field tended to become more diagnostically specific moving from CN to CC to DD. Our findings also highlight the value of leveraging the three fields together to improve case detection. This finding is noteworthy, as a significant number of cases could be missed without considering one or more fields. This was particularly evident with our respiratory and ILI syndromes, where a significant number of cases could be missed without using the CN field. Our results for the contribution of individual fields to the overall surveillance system should be interpreted with caution, however, as many of the algorithms were built to pull information from CC and CN data only in the absence of a confirmed case or non-case from the DD field. For a more accurate assessment of the performance of each field, algorithms should be structured to use information from each data source independently, rather than in a combined fashion.
A useful step in designing our ED-SyS was ensuring that we allotted time for refining the system using local data.
The original iteration of our ED-SyS was built using terms and case definitions informed by relevant literature, which did not capture the range of nuances inherent in our local data source. By reviewing and validating records on a case-by-case basis, we were able to identify additional terms and patterns of chart record-keeping that allowed us to make adjustments to the syndromic case definitions that improved the predictive ability of the ED-SyS. We found the CEDIS terms used in the CC field were especially useful for exploring false-positive cases flagged in our results. Having a standard terminology allowed us to group false-positive cases from each syndrome by their CEDIS term and explore whether it was necessary to build additional exclusions into our syndromic case algorithms. This was not as feasible with either the CN or DD fields, as they both contained free-text input, which proved much more variable than the CEDIS codes and their standardized terms.
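Grouping false positives by CEDIS code, as described above, amounts to a simple tabulation. The Python sketch below is illustrative only: the function name and input layout are hypothetical, and the example reads the 571 in the Results as the number of GI-flagged visits carrying CEDIS 251, which is an interpretation of the reported figures (91 classified "1" and 365 classified "2").

```python
from collections import Counter


def false_positive_share_by_cedis(records):
    """Return {cedis_code: share of flagged records validated as false positive}.

    `records` is an iterable of (cedis_code, classification) pairs, where
    classifications "1" and "2" are false positives and "0" is a true positive.
    """
    totals = Counter()
    false_positives = Counter()
    for code, classification in records:
        totals[code] += 1
        if classification in ("1", "2"):
            false_positives[code] += 1
    return {code: false_positives[code] / totals[code] for code in totals}


# Illustrative reconstruction for CEDIS 251 (abdominal pain) in the GI syndrome.
gi_251 = [("251", "0")] * 115 + [("251", "1")] * 91 + [("251", "2")] * 365
print(round(false_positive_share_by_cedis(gi_251)["251"], 3))  # 0.799
```

Codes with a consistently high false-positive share are candidates for exclusion from a case definition, subject to review against local context.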
We were motivated to establish an ED-SyS to not only support surveillance activities during the 2020 AWG, but also enhance Yukon's surveillance infrastructure for detection and response capabilities for years to follow. Throughout all stages of development, we found it essential to leverage both local knowledge of contextual issues and disease risk and existing research to establish and refine our syndromic case definitions and algorithms. Partnerships between WGH, OCMOH and the Public Health Agency of Canada greatly contributed to the timely design and implementation of the ED-SyS. Future development of the ED-SyS will include expanding to the syndromic surveillance of other public health issues in Yukon, including opioids, cannabis, forest fires, and secondary health impacts of the COVID-19 pandemic.

Conclusions/Practice Implications
Our study highlights the feasibility of implementing an automated ED-SyS and validating syndrome case definitions in a low-resourced/remote setting using simple tools, resources, and adapted gold standard methods.
Our approach to developing an ED-SyS allowed us to move away from "drop-in" paper-based methods to create a "low-tech" sustainable system that can be leveraged for other mass-gathering events, other emerging health issues of concern, and general ongoing surveillance. Our study also reinforces the importance and value of validating syndrome case definitions using local data. Importantly, our study provides a path forward for other lower-resourced rural/remote settings on how to develop and validate syndrome case definitions.
[...] (http://www.gov.yk.ca/legislation/acts/hipm_c.pdf), this study was recognized as "public health surveillance" work and as such it governed the collection, use, and disclosure of the associated data. Submission and approval by Yukon's Health Research Review Committee was therefore not required. No formal IRB body operates in the Yukon. A privacy impact assessment (PIA) was required as part of routine Yukon Government processes to outline the relevant authorities under the HIPMA legislation and data safeguards. The PIA was reviewed and approved by

Competing interests
The authors declare that they have no competing interests.

Funding
The authors wish to acknowledge funding from the Public Health Agency of Canada to provide a Field Epidemiologist to support the analysis and writing of this project.

Authors' contributions
EB designed and led the study and wrote the original code for the surveillance system. EB and BMH performed validation, analysis and interpretation and drafted and edited the manuscript. BH contributed to study design, provided consultation and oversight, and contributed to proofreading and editing the manuscript. All authors read and approved the final manuscript.