In this retrospective cross-sectional time series study, we linked data from six state-level sources to compare staffing characteristics and patient outcomes for hospitals across Washington. Throughout the study, we used feedback from key stakeholders to refine the selection of participants, data sources, analysis, and outcomes.
Sample
We identified hospitals using license numbers and data in the 2021 Comprehensive Hospital Abstract Reporting System (CHARS) directory from the WA Department of Health. Eligible hospitals were Washington acute care hospitals (including critical access hospitals [CAHs]) in operation for at least part of the study period (2016–2019). Exclusion criteria were identification as a specialty hospital (psychiatric, rehabilitation, cancer care, or pediatric) or as a specialty wing such as an extended care unit. We chose the study period based on stakeholder feedback that the years before the COVID-19 pandemic needed to be examined separately from the years after its onset. We conducted a complete case analysis, excluding any data point for which any of the model variables were missing.
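A complete case analysis of this kind can be sketched with pandas; the column names and values below are hypothetical (the actual model variables are listed in Supplement 1):

```python
import pandas as pd

# Hypothetical hospital-year table; real variable names are in Supplement 1.
hospital_years = pd.DataFrame({
    "hospital_id": ["A", "A", "B", "B"],
    "year": [2016, 2017, 2016, 2017],
    "patient_to_staff": [4.1, None, 3.8, 3.5],
    "adverse_events": [2, 1, None, 0],
})

model_vars = ["patient_to_staff", "adverse_events"]

# Complete-case analysis: drop any row missing any model variable.
complete_cases = hospital_years.dropna(subset=model_vars)
```

Here two of the four rows survive, because each of the other two is missing one model variable.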
Data sources and processing
During interviews and a town hall meeting, we asked key stakeholders for input on existing state-level data sources and their strengths and limitations; we then screened the available sources and selected them based on data compatibility and applicability. Sources included CHARS, individual hospitals’ year-end reports to the Department of Health, adverse event data from the Department of Health, nursing licensure information from the Board of Nursing (formerly the Nursing Care Quality Assurance Commission), and other publicly available records. See Supplement 1 for a complete listing of sources and variables. The standard unit of measurement was a hospital-year: the merged data containing all model variables for a single hospital in a single year. Each data source required processing before it could be linked to the others. For example, year-end reports contained cost centers specifying nursing and non-nursing full-time equivalents (FTEs); we aggregated these to obtain the total number of nursing and other care team FTEs in each facility for each hospital-year. Nursing licensure data include RNs’ employers’ zip codes, so we matched each zip code to its unique corresponding organization and then aggregated the data to derive the education and experience model variables for each hospital. CHARS data include complex patient-level discharge information and required the most processing to assign outcomes and to aggregate and measure covariates per patient-day and hospital-year.
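The cost-center aggregation step can be sketched as follows; the cost-center names, FTE values, and nursing flag are illustrative, not the actual categories from the year-end reports (those are in Supplement 2):

```python
import pandas as pd

# Hypothetical cost-center rows from year-end reports (names illustrative).
cost_centers = pd.DataFrame({
    "hospital_id": ["A", "A", "A", "B"],
    "year": [2016, 2016, 2016, 2016],
    "cost_center": ["Acute Nursing", "Pharmacy", "Administration", "Acute Nursing"],
    "fte": [120.0, 15.0, 30.0, 40.0],
    "is_nursing": [True, False, False, True],
})

# Aggregate to one row per hospital-year, splitting nursing vs. other FTEs.
ftes = (
    cost_centers
    .assign(nursing_fte=lambda d: d["fte"].where(d["is_nursing"], 0.0),
            other_fte=lambda d: d["fte"].where(~d["is_nursing"], 0.0))
    .groupby(["hospital_id", "year"], as_index=False)[["nursing_fte", "other_fte"]]
    .sum()
)
```

Each resulting row is one hospital-year with total nursing and non-nursing FTEs, ready to be merged with the other sources on (hospital, year).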
Outcomes
The primary outcome of interest was the number of serious adverse events, which we defined to include reportable pressure ulcers, falls, and medication errors. While stakeholders identified multiple pertinent patient outcomes in the model (Fig. 1), adverse events are commonly used as care quality indicators related to staffing and nursing care.[13] We identified the number of events for each hospital-year by aggregating data from state-level reports, which follow the National Quality Forum 2011 definitions of reportable events.[14] Adverse events represented in this study include stage 3, stage 4, or unstageable pressure ulcers acquired after admission to the healthcare setting, and patient death or serious injury associated with either a medication error or a fall during inpatient admission. The outcome variable in our model was the total number of these three specific types of adverse events reported by each hospital during each year of the study.
Exposure variables
Our four exposure variables corresponded to four of the staffing characteristics requested by the legislature: number, type, education, and experience. Our variable for staff number was the patient-to-staff ratio, defined as the total number of inpatient hours recorded in CHARS divided by the number of care team staff hours converted from FTE as recorded in the year-end cost center reports. Care team staff consisted of acute care nursing staff plus core clinical and nonclinical staff. Nurses on the research team (SI) and among the key stakeholders collaborated to determine these categories based on examination of available cost centers (Supplement 2). Our other three exposure variables were the ratio of RN FTEs to all care team staff FTEs (staff type), the fraction of RNs with less than 5 years since licensure (experience), and the fraction of RNs without a Bachelor of Science in Nursing (BSN) (education).
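The four exposure variables can be computed directly from the merged hospital-year data. The column names and values below are hypothetical stand-ins for the actual CHARS, year-end-report, and licensure fields:

```python
import pandas as pd

# Hypothetical merged hospital-year data (one row per hospital-year).
df = pd.DataFrame({
    "inpatient_hours": [100_000.0],   # from CHARS
    "care_team_hours": [25_000.0],    # converted from FTEs in year-end reports
    "rn_fte": [80.0],
    "care_team_fte": [120.0],
    "rn_count": [100],                # from licensure data matched by zip code
    "rn_under_5yr": [30],
    "rn_no_bsn": [40],
})

exposures = pd.DataFrame({
    "patient_to_staff": df["inpatient_hours"] / df["care_team_hours"],  # number
    "rn_share": df["rn_fte"] / df["care_team_fte"],                     # type
    "frac_under_5yr": df["rn_under_5yr"] / df["rn_count"],              # experience
    "frac_no_bsn": df["rn_no_bsn"] / df["rn_count"],                    # education
})
```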
Control variables
Our desk review, key informant interviews, and town hall feedback identified a complex causal web of factors that might confound the relationship between staffing and outcomes; the following briefly describes the covariates we included within the model categories to control for this confounding (Fig. 1).[8, 11] Hospital characteristics included trauma level designation, percent of patients transferred in from other hospitals, and certification for open heart surgery, which can serve as a proxy for equipment and technology.[5] Patient characteristics included demographics such as age, sex (recorded as female, male, or unknown per hospital standards[16]), race (recorded per Office of Management and Budget standards[15]), and Medicaid insurance status. Other patient characteristics included the hospital’s case mix index for the year, the average number of diagnoses in a hospital-year, the percentage of patient-days with an obstetrics procedure (ICD-10-PCS code starting with 1), and the percentage of patient-days with a high care intensity diagnosis or procedure code. Based on stakeholder feedback, we defined high care intensity diagnoses as dementia, self-harm, opioid use, drug dependency, obesity, traumatic brain injury, heart failure, and asthma, identified using ICD-10-CM diagnosis codes in CHARS. Stakeholders reported that diagnoses alone were insufficient indicators of care intensity, so we worked with several nurses to review the 500 most common ICD-10-PCS procedure codes in CHARS and rated each as likely to lead to higher intensity care versus usual care intensity (Supplement 3).
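Flagging patient-days by code can be sketched as simple prefix checks; the specific diagnosis prefixes below are hypothetical examples for a few of the listed conditions, and the procedure ratings actually used are in Supplement 3:

```python
# Hypothetical ICD-10-CM prefixes for a few high care intensity diagnoses
# (e.g. dementia, self-harm, heart failure, asthma); the study's full
# definition covered more conditions than shown here.
HIGH_INTENSITY_DX_PREFIXES = ("F03", "T14.91", "I50", "J45")

def is_obstetric_procedure(pcs_code: str) -> bool:
    # ICD-10-PCS obstetrics codes begin with the character "1".
    return pcs_code.startswith("1")

def is_high_intensity_dx(cm_code: str) -> bool:
    # True if the diagnosis code falls under any flagged prefix.
    return cm_code.startswith(HIGH_INTENSITY_DX_PREFIXES)
```

The share of patient-days with at least one flagged code then becomes the covariate for that hospital-year.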
Finally, we conducted separate analyses for CAHs and general acute care hospitals (ACHs), which we defined to be any eligible acute care hospital that was not designated as a CAH in the CHARS directory. This decision was based on stakeholder feedback that the vastly different staffing models and requirements for CAHs[10] meant that staffing did not vary as widely in those settings as in ACHs.
Statistical analyses
For each category of hospital (ACH, CAH), we fit a hierarchical Bayesian Poisson regression model predicting the number of adverse events in each hospital-year as a function of our exposure and control variables. Because little data were available for hospital-level control variables, we included hospital-level random effects to account for similarities between measurements at the same hospital in different years. The model has the following form:
$$n_{hy} \sim \text{Poisson}\left(d_{hy}\,{\lambda }_{hy}\right), \qquad \log{\lambda }_{hy}=\alpha +\sum _{i}{X}_{\left(h,y\right),i}\,{\beta }_{i}+{u}_{h},$$
where \(n_{hy}\) is the number of adverse events and \(d_{hy}\) the number of patient-days in hospital-year \(\left(h,y\right)\),
with priors
$$\begin{array}{rlrl}\alpha & \sim \text{Normal}\left(0,500\right), & {u}_{h} & \sim \text{Normal}\left(0,{\sigma }^{2}\right),\\ {\beta }_{i} & \sim \text{Normal}\left(0,10\right), & \sigma & \sim \text{HalfNormal}\left(100\right).\end{array}$$
Here, \(h\) indexes hospitals, \(y\) indexes years, and \(i\) indexes the explanatory variables in the model (exposures plus controls). \(X\) is the design matrix, with each column (indexed by \(i\)) corresponding to an explanatory variable, and each row (indexed by \(\left(h,y\right)\)) corresponding to one hospital-year of data. Hospital-level random effects are represented by \({u}_{h}\). The model estimates values for \(\sigma\), \({u}_{h}\), \(\alpha\), and the vector \(\beta =\left({\beta }_{1},\dots ,{\beta }_{K}\right)\), where \(K\) is the total number of explanatory variables.
The model assumes a log-linear relationship between the explanatory variables and the expected rate of adverse events per patient-day. In particular, \(\exp\left({\beta }_{i}\right)\) is the proportional increase in the expected adverse event rate when exposure variable \(i\) increases by one unit; we call \(\exp\left({\beta }_{i}\right)\) the adjusted rate ratio per unit increase in exposure variable \(i\). Because of the structure of our causal diagram, this interpretation as an adjusted rate ratio applies to the exposure variables (staffing characteristics), whose effects have been adjusted for confounding as much as possible given our data, but not to the control variables (hospital and patient characteristics), whose estimates may still be confounded.
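As a worked example of this interpretation (the coefficient value here is made up, not a study result):

```python
import math

beta_i = 0.10  # hypothetical fitted coefficient for one exposure variable
rate_ratio = math.exp(beta_i)
# exp(0.10) ≈ 1.105: a one-unit increase in the exposure multiplies the
# expected adverse event rate by about 1.105, i.e. roughly a 10.5% increase.
```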
To estimate the unadjusted rate ratio for each exposure variable, we fit a univariate Poisson regression of the same form but without random effects. This form of the model estimates \(\alpha\) and a single \({\beta }_{i}\), which can be exponentiated to obtain the unadjusted rate ratio for the corresponding exposure variable.
All regressions were done using NumPyro version 0.10.0 with Python 3.9.12. All explanatory variables were standardized to mean 0.0 and standard deviation (SD) 1.0 before fitting the regression models. Thus, each \(\exp\left({\beta }_{i}\right)\) value from each model estimates the adverse event rate ratio per one-SD increase in the corresponding exposure variable.
Sensitivity and counterfactual analyses
To confirm that the findings were robust to the sample of hospitals, we conducted out-of-sample validation. We re-ran our analyses 10 times with 10% of the hospitals randomly removed from the data. This resulted in no substantive change in our findings. We also completed a counterfactual analysis to present findings to stakeholders in more meaningful terms. We used our fitted random effects Poisson regression to predict how many adverse events would be averted in a hypothetical scenario in which all ACHs had the median patient-to-staff ratio or lower (with all other variables remaining the same). We chose the median of the observed data as a level that could be realistically achieved and that fell within the range of our models’ predictions.
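The hold-out procedure can be sketched as follows; the hospital IDs are hypothetical, and the refit stub stands in for re-running the full regression on the retained hospitals:

```python
import random

# Hypothetical hospital list; the study's sample is defined above.
hospitals = [f"H{i:02d}" for i in range(50)]

def refit_without(kept_hospitals):
    # Placeholder for refitting the regression on the retained hospitals;
    # in the study this produced a new set of fitted rate ratios.
    return len(kept_hospitals)

rng = random.Random(42)  # fixed seed for reproducibility of the sketch
results = []
for _ in range(10):
    # Drop a random 10% of hospitals, refit on the rest.
    dropped = set(rng.sample(hospitals, k=len(hospitals) // 10))
    kept = [h for h in hospitals if h not in dropped]
    results.append(refit_without(kept))
```

Comparing the ten refitted estimates against the full-sample estimates is what established that no substantive change occurred. The counterfactual analysis is analogous: cap each ACH's patient-to-staff ratio at the observed median, hold all other covariates fixed, and compare predicted event counts under the fitted model.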