Identifying inpatient hospitalizations with continuous electroencephalogram monitoring from administrative data

Background: Continuous electroencephalography (cEEG) is increasingly utilized in hospitalized patients to detect and treat seizures. Epidemiologic and observational studies using administrative datasets can provide insights into the comparative and cost effectiveness of cEEG utilization. Defining patient cohorts that underwent acute inpatient cEEG from administrative datasets is limited by the lack of validated codes differentiating elective epilepsy monitoring unit (EMU) admissions from acute inpatient hospitalizations with cEEG utilization. Our aim was to develop hospital administrative data-based models to identify acute inpatient admissions with cEEG monitoring and distinguish them from EMU admissions.
Methods: This was a single-center retrospective cohort study of adult (≥ 18 years old) inpatient admissions with a cEEG procedure (EMU or acute inpatient) between January 2016 and April 2022. The gold standard for acute inpatient cEEG vs. EMU was obtained from the local EEG recording platform. An extreme gradient boosting model was trained to classify admissions as acute inpatient cEEG vs. EMU using administrative data including demographics, diagnostic and procedure codes, and medications.
Results: Our cohort included 9,523 patients with 10,783 hospital admissions (8.5% EMU, 91.5% acute inpatient cEEG). The average age was 59 (SD 18.2) years and 46.2% were female. The model achieved an area under the receiver operating characteristic curve of 0.92 (95% CI [0.91–0.94]) and an area under the precision-recall curve of 0.99 [0.98–0.99] for classification of acute inpatient cEEG.
Conclusions: Our model has the potential to identify cEEG monitoring admissions in larger cohorts and can serve as a tool to enable large-scale, administrative data-based studies of EEG utilization.


Background
Continuous electroencephalography (cEEG) is increasingly utilized in hospitalized patients with acute brain injury or altered mental status to detect seizures and other seizure-like patterns that can worsen outcomes (Herman et al. 2015). In the United States, there has been a 10-fold increase in the use of cEEG in the acute inpatient setting, particularly in critical care (Hill et al. 2019; Sivaraju and Gilmore 2016). At the same time, cEEG is resource intensive, with limited availability in smaller health care facilities, and is utilized more frequently in larger, urban, and academic centers (Hill et al. 2019; Ney et al. 2013). Epidemiologic and observational studies using large administrative datasets can provide insights into the comparative effectiveness and cost effectiveness of cEEG utilization in acutely ill patients. Such studies can guide policies and protocols that improve access to cEEG where indicated (e.g., identifying patients that may benefit most from transfer to centers performing cEEG), support the development of cEEG utilization quality measures, generate evidence for rigorous randomized trials of cEEG-guided anti-seizure treatment, and ultimately improve outcomes.
Prior work examining administrative datasets has shown that cEEG monitoring in hospitalized critically ill patients is associated with lower in-hospital mortality (Hill et al. 2019; Ney et al. 2013). However, a limitation of prior studies that have used administrative datasets is the lack of validated codes differentiating elective epilepsy monitoring unit (EMU) admissions from acute inpatient hospitalizations with cEEG utilization. Acute inpatient cEEG and EMU EEG have the same International Classification of Diseases (ICD) and Current Procedural Terminology (CPT) codes. As a result, prior work has excluded all patients that were elective admissions or were not mechanically ventilated to define patient cohorts that underwent acute inpatient continuous EEG monitoring, resulting in potential selection bias. The aim of this study is to develop hospital administrative data-based models to identify acute inpatient admissions with cEEG monitoring.

Study cohort
In this study, we conducted a retrospective analysis of adult patients (≥ 18 years old) admitted to a single center between January 1, 2016 and April 30, 2022. The research protocol was approved by the Mass General Brigham (MGB) Institutional Review Board and a waiver of informed consent was obtained. Patients were selected with the study aim in mind, namely identifying acute inpatient admissions with cEEG monitoring. Figure 1 shows the patient selection flow chart. Patients were included if they underwent cEEG monitoring (either in the EMU or as part of an acute inpatient hospitalization). cEEG monitoring was defined using long-term EEG monitoring ICD 10th revision and CPT codes (Table A1 from the Additional File).

Study outcome variables
Our study outcome consisted of a binary variable indicating whether an inpatient admission with a cEEG procedure was performed in the acute inpatient hospital setting (cEEG) or in the EMU setting (EMU). From here on "cEEG" will refer to acute inpatient admissions with continuous EEG monitoring, and "EMU" will refer to epilepsy monitoring unit admissions. The gold standard for the procedures was defined using the hospital EEG recording platform.

Study covariates
The study covariates for the hospital admissions in our study cohort are presented in Table A2 from the Additional File. Diagnoses and procedures were defined using ICD and CPT codes and are presented in Table A1 from the Additional File. The binary covariates considered were indicators ('1' for presence and '0' for absence) of daily laboratory values acquired, inpatient medications ordered, procedures performed, type of admission (elective, emergency, and urgent), primary and secondary diagnoses of traumatic brain injury (TBI), stroke, and epilepsy, seizures or convulsions, death at discharge, discharge to home or self-care, and female sex. The numerical covariates consisted of the number of distinct procedures, the number of distinct medications, hospital length of stay (LOS) in days, and age at admission. Numerical covariates were normalized using min-max normalization (Han, Pei, and Kamber 2011), where the minimum and maximum reference values for each covariate were calculated from the training set. The splitting of the data into training and testing sets is detailed in the following section. Regarding outlier preprocessing, we identified one outlier for hospital LOS, which we imputed with the median LOS.
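The normalization step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy length-of-stay values, the hand-set outlier flag, and the choice to take the median over non-flagged values are all assumptions. The only point carried over from the text is that the minimum and maximum are computed on the training set and then reused for the test set.

```python
from statistics import median


def fit_min_max(train_values):
    """Return the (min, max) reference values computed on the training set only."""
    return min(train_values), max(train_values)


def apply_min_max(values, lo, hi):
    """Scale values with training-set references; a constant column maps to 0.0."""
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def impute_outliers_with_median(values, is_outlier):
    """Replace flagged values with the median of the non-flagged values (assumed rule)."""
    kept = [v for v, flag in zip(values, is_outlier) if not flag]
    med = median(kept)
    return [med if flag else v for v, flag in zip(values, is_outlier)]


# Hypothetical length-of-stay values in days (illustrative only).
train_los = [2, 4, 6, 10, 400]
test_los = [1, 6, 12]

# The outlier is flagged by hand here; how it was detected is not stated in the text.
train_los = impute_outliers_with_median(train_los, [False, False, False, False, True])

lo, hi = fit_min_max(train_los)                  # references come from training data
train_scaled = apply_min_max(train_los, lo, hi)
test_scaled = apply_min_max(test_los, lo, hi)    # same references reused for test data
```

With this scheme, test values outside the training range fall outside [0, 1]; whether such values were clipped is not stated in the text, so the sketch leaves them as they are.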
The procedures (Table A1 from the Additional File) considered were the following: abdomen/pelvis computerized tomography (CT) scan, arterial line, chest X-ray, head CT scan, lumbar puncture, magnetic resonance imaging (MRI), mechanical ventilation, transthoracic echocardiogram, and tube feed orders. The number of procedures consisted of the sum of the distinct procedures performed during the hospital stay, varying in the range between zero and nine.
The inpatient medications considered were the following: cefepime, ceftriaxone, dexmedetomidine, dobutamine, dopamine, enoxaparin, epinephrine, heparin, midazolam, nicardipine, norepinephrine, phenylephrine, piperacillin, piperacillin/tazobactam, propofol, vancomycin, and vasopressin. The number of medications consisted of the sum of the distinct inpatient medications ordered during the hospital stay, ranging from zero to seventeen.
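To make the covariate construction concrete, the sketch below derives the per-medication binary indicators and the distinct-medication count for one admission. The medication names are those listed above; the function name, the `n_medications` key, the case-insensitive matching, and the example orders are assumptions made for illustration. The procedure covariates could be built the same way from the nine listed procedures.

```python
MEDICATIONS = (
    "cefepime", "ceftriaxone", "dexmedetomidine", "dobutamine", "dopamine",
    "enoxaparin", "epinephrine", "heparin", "midazolam", "nicardipine",
    "norepinephrine", "phenylephrine", "piperacillin", "piperacillin/tazobactam",
    "propofol", "vancomycin", "vasopressin",
)


def medication_covariates(orders):
    """Build 0/1 indicators per listed medication plus the count of distinct ones."""
    # Repeated orders of the same drug collapse to one entry in the set.
    ordered = {name.strip().lower() for name in orders}
    row = {med: int(med in ordered) for med in MEDICATIONS}
    # Drugs outside the list (e.g. aspirin below) are ignored by construction.
    row["n_medications"] = sum(row[med] for med in MEDICATIONS)
    return row


# Hypothetical medication orders for a single admission.
row = medication_covariates(["Heparin", "heparin", "Propofol", "aspirin"])
```

Because each indicator is 0 or 1, the count is bounded by the length of the list, which matches the zero-to-seventeen range stated above.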

Modeling design and evaluation
We performed a random sampling of hospital admissions in our cohort to create training (70%) and holdout testing (30%) sets with distinct patients. With the training set we developed an extreme gradient boosting model (XGBoost) (Chen and Guestrin 2016) and performed hyperparameter tuning in 100 iterations of 10-fold cross validation. The hyperparameter tuning methodology is described in Additional File section A.2. We selected a threshold for binary classification on the training data that achieved a positive predictive value (PPV) yielding a balance between false-positive and false-negative predictions. We assessed both the positive and negative predictive values (PPV and NPV, respectively). We evaluated model performance using the area under the precision-recall curve (AUPRC) (Saito and Rehmsmeier 2015), which shows the trade-off between PPV and sensitivity (also called true positive rate or recall) for different thresholds. We also evaluated the area under the receiver operating characteristic curve (AUROC), which quantifies the trade-off between sensitivity and false positive rate (also known as 1-specificity) across different decision thresholds (Steyerberg et al. 2010). Given the imbalance in our dataset, we present the macro average (Sokolova and Lapalme 2009) performance for the classification, and the performance for each class (EMU vs cEEG). A macro average calculates performance metrics independently for both classes and then takes the average, giving both classes equal weight (Sokolova and Lapalme 2009). We performed 1000 bootstrapping iterations to calculate 95% confidence intervals (CI) in the hold-out test set, an independent test set not used for model training or validation. We assessed covariate importance using SHAP (SHapley Additive exPlanations) (Lundberg and Lee 2017), which estimates the contribution of each feature to the model's predictions.
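The macro-averaging and bootstrapping steps described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the label coding (1 for cEEG, 0 for EMU), the toy predictions, the function names, and the use of a percentile interval are all assumed, since the text does not specify which bootstrap interval was used.

```python
import random


def class_ppv(y_true, y_pred, positive):
    """PPV (precision) when `positive` is treated as the class of interest."""
    hits = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not hits:
        return 0.0  # assumed convention when the class is never predicted
    return sum(t == positive for t in hits) / len(hits)


def macro_ppv(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of per-class PPV, so both classes count equally."""
    return sum(class_ppv(y_true, y_pred, c) for c in classes) / len(classes)


def bootstrap_ci(y_true, y_pred, metric, n_iter=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI obtained by resampling admissions with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[round((alpha / 2) * n_iter)]
    upper = stats[min(n_iter - 1, round((1 - alpha / 2) * n_iter) - 1)]
    return lower, upper


# Toy test set: 18 cEEG (1) and 2 EMU (0) admissions, mimicking class imbalance.
y_true = [1] * 18 + [0] * 2
y_pred = [1] * 17 + [0] + [0, 1]

point = macro_ppv(y_true, y_pred)
ci_low, ci_high = bootstrap_ci(y_true, y_pred, macro_ppv)
```

In this toy example the cEEG class has a PPV of 17/18 while the EMU class has a PPV of 1/2, so the macro average is pulled well below the majority-class value. This is why macro averaging is informative on an imbalanced cohort. Note that the PPV of the EMU class equals the NPV when cEEG is taken as the positive class.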

Cohort characteristics
Our cohort comprised 9,523 patients and a total of 10,783 hospital admissions, after applying inclusion and exclusion criteria (Fig. 1). The average age of the cohort was 59 years (standard deviation (SD) 18.2), with the majority being male (53.8%), White (75.5%), and non-Hispanic (82.7%) (Table 1). The majority of admissions (91.5%) were acute inpatient hospitalizations (i.e., cEEG rather than EMU). Demographic characteristics were approximately the same at the hospital admission level (

Modeling results
The XGBoost model was trained with all study covariates described in the Methods section (Table A2 from the Additional File). Model performance evaluated on the testing set is presented in Table 2. The hyperparameters selected during training in 10-fold cross validation are presented in Table A3 from the Additional File. We experimented with different thresholds for fixed values of PPV between 95% and 98% ( Figure A1 from the Additional File. There were 130 (4%) misclassifications of acute inpatient cEEG incorrectly classified as EMU admissions, and 73 (26.6%) misclassifications of EMU admissions incorrectly classified as acute inpatient cEEG, as presented in Fig. 2. Among the 130 cEEG admissions incorrectly classified as EMU, 79% (N = 102) were elective and 90% (N = 117) were discharged to home or self-care. Among the EMU admissions incorrectly classified as acute inpatient cEEG, 71% (N = 52) were emergency admissions and 93% (N = 68) had daily laboratory values acquired, a higher proportion than in the EMU class overall (59%, Table A2 from the Additional File).
Legend: AUROC, area under the receiver operating characteristic curve; AUPRC, area under the precision-recall curve; cEEG, acute inpatient hospitalization admissions class; EMU, epilepsy monitoring unit admissions class; PPV, positive predictive value; NPV, negative predictive value.
Since most EMU admissions in our cohort were elective (72.4%) and discharged to home or self-care (82.7%) (Table A2 from the Additional File), we trained a model excluding both the admission type covariates (emergency, urgent, and elective) and discharge disposition to home or self-care, and evaluated its performance on the test set (AUROC and AUPRC presented in Figure A2 from the Additional File). The overall macro average model performance (

Covariates importance
We analyzed the importance of the covariates in the design of the XGBoost model. The average magnitude of the SHAP values for the 20 top features is presented in Fig. 3, and the SHAP raw values are presented in Figure A3 from the Additional File.
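The ranking by average SHAP magnitude can be illustrated as below. The sketch assumes the per-admission SHAP values have already been computed by a SHAP explainer and are available as a list of rows; the function name, feature names, and numbers are hypothetical and chosen only to show the aggregation.

```python
def rank_by_mean_abs(shap_rows, feature_names, top_k=20):
    """Rank features by the mean absolute SHAP value across admissions."""
    n = len(shap_rows)
    scores = []
    for j, name in enumerate(feature_names):
        # The sign is discarded: only the size of the contribution matters here.
        scores.append((name, sum(abs(row[j]) for row in shap_rows) / n))
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Hypothetical SHAP values: one row per admission, one column per feature.
features = ["elective_admission", "age", "heparin"]
shap_rows = [
    [-2.0, 0.5, 0.1],
    [1.0, -0.5, 0.3],
    [-3.0, 1.5, -0.2],
]

ranking = rank_by_mean_abs(shap_rows, features)
```

Averaging magnitudes summarizes how strongly a covariate moves predictions overall, whereas the raw signed values show the direction of the effect for each admission.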
Elective admission was the most important covariate for the model to classify an admission as EMU (left side in Figure A3 from the Additional File). We sought to understand the distribution of admission types for each class, since EMU admissions are frequently elective. In both the training and test sets, 72% of EMU admissions and 11% of acute inpatient (cEEG class) admissions were elective. Furthermore, we assessed whether the non-elective EMU admissions were related to the COVID-19 pandemic. According to one study (Zepeda et al. 2021), in the setting of the COVID-19 pandemic, urgent and emergent EMU admissions were required due to increased seizure or event frequency. We confirmed that the non-elective EMU admissions spanned all years of our study period, with the following number of admissions per year from 2016 to 2022: 39, 46, 48, 41, 31, 38, and 9, respectively. Since not all EMU admissions were elective, it was important to combine this covariate with others to develop the classification model.
Emergency admission, orders of medications such as heparin, vasopressin, cefepime, or epinephrine, daily laboratory values acquired, mechanical ventilation, transthoracic echocardiogram, chest X-ray, and a diagnosis of stroke were important predictors of acute inpatient hospitalizations (cEEG).
The EMU class was associated with younger age at admission (blue color in Figure A3 from the Additional File) when compared to the cEEG class (mostly pink in Figure A3 from the Additional File). The average age (SD) for EMU admissions was 42 (18) years, while for the cEEG class it was 60 (18) years, a difference of approximately 18 years (Table A2 from the Additional File). The EMU admissions class was also associated with a lower number of procedures and medications when compared to the cEEG class. EMU admissions were also associated with being discharged home, having a diagnosis of epilepsy or seizures, and orders of enoxaparin and ceftriaxone. Patients receiving ceftriaxone alone were likely intracranial monitoring admissions.

Discussion
Our model using hospital administrative and billing data distinguishes continuous EEG performed in the acute inpatient setting from the EMU setting and can enable comprehensive comparative effectiveness and cost effectiveness analysis of continuous EEG utilization in the acute and critical care setting. Such large epidemiologic studies can then provide further guidance for randomized trials of continuous EEG-guided anti-seizure treatment in the acute setting, and refinement of continuous EEG guidelines and protocols, particularly for resource-limited settings.
There has been limited prior work in the development and validation of administrative models for accurate identification of continuous EEG in the hospital setting from administrative datasets. One prior study evaluated ICD-based models for accurate identification of EMU admissions from administrative datasets (Kamitaki et al. 2020).
The main limitation of the study is that it is a single-center study and therefore may not be generalizable. However, we used standard billing and procedure codes, medications, and admission data that are routinely available in most large commercially available inpatient datasets and in hospital administrative datasets. Other covariates could have been included in the model, such as free-text clinical notes, including EEG reports, which we propose as future work, along with validation in other administrative datasets.

Conclusions
The model developed in this study can distinguish continuous EEG performed in the acute inpatient setting from continuous EEG performed in the EMU setting and reduces the number of misclassifications. This model will allow the identification of continuous EEG monitoring admissions in larger cohorts, thereby contributing to the scale of research on EEG utilization.

Declarations
Ethics approval and consent to participate
We confirm that all methods were carried out in accordance with relevant guidelines and regulations. The research protocol was approved by the Mass General Brigham (MGB) Institutional Review Board and a waiver of informed consent was obtained.

Consent for publication
Not applicable.

Availability of data and materials
We will make our de-identified data and code available in a public GitHub repository for reproducibility upon acceptance of the article for publication.