Setting and participants
Old Town Clinic (OTC) is a federally qualified health center (FQHC) in Portland, Oregon, operated by Central City Concern, a community-based nonprofit that provides comprehensive services to end homelessness and support self-sufficiency.11 OTC serves approximately 6,000 low-income adult patients, with high rates of homelessness, mental health conditions, and substance use disorders, across five integrated primary care teams.12
Overview of approach
This study arose as part of a broader strategic planning exercise to identify the needs of OTC's population and guide decision-making to better tailor health services for patients. We integrated stakeholder feedback with clustering methods to develop and identify clinical subgroups. Administrators (RS, MS), data analytics and research personnel (MM, BC), and clinicians met regularly to outline priorities for the project and provide feedback on model development. We used latent class analysis (LCA) to define an initial set of groups, then presented these groups for stakeholder feedback and revision in an iterative process. The results informed the creation of a final neural network-based classification model.
Data collection and measures
Patient data were derived from the electronic health record (EHR), including demographics, diagnoses, health program enrollments, and other clinical and billing data. We retrospectively identified all primary care patients enrolled at OTC on December 1, 2017, defined as having at least one visit in the previous two years. We extracted hospitalization and emergency department visit data from Oregon's Emergency Department Information Exchange (EDIE).13 EDIE is a web-based platform that provides access to real-time hospital utilization data and has been broadly adopted by hospitals and health systems in Oregon, Washington, and northern California. In partnership with clinicians, data analysts, and researchers, we developed a list of candidate variables that might separate patient groups (referred to as classifying variables). We abstracted data for 75 days after December 1, 2017 to ensure that documentation and records were complete.
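To make the cohort definition concrete, the following is a minimal R sketch, assuming a visit-level table `visits` with hypothetical columns `patient_id` and `visit_date`; it is illustrative, not the clinic's actual extraction code.

```r
# Illustrative cohort definition (hypothetical table and column names).
library(dplyr)

index_date <- as.Date("2017-12-01")

# Patients with at least one primary care visit in the two years
# before the index date (approximated here as 730 days) count as enrolled.
cohort <- visits %>%
  filter(visit_date > index_date - 730, visit_date <= index_date) %>%
  distinct(patient_id)
```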
Variable selection
We considered demographic, diagnostic, and utilization variables in our analyses. Our continuous classifying variables (age, total medical hospitalizations, psychiatric hospitalizations, and emergency department visits in the prior 365 days) were grouped into bins. Age was categorized as 18–29, 30–39, 40–49, 50–59, and 60 and above. Medical and psychiatric hospitalizations were each categorized as 0, 1, 2, and 3 or more. Emergency department visits were categorized as 0, 1, 2–3, 4–6, and 7 or more. We included binary variables (yes/no) for diagnoses of asthma or chronic obstructive pulmonary disease (COPD), diabetes, congestive heart failure, hepatitis C, chronic kidney disease, chronic liver disease, severe head injury, psychotic disorders, bipolar disorders, depressive disorders, trauma-related disorders, alcohol use disorders, opioid use disorders, and stimulant use disorders. Using the ICD-10 diagnoses recorded in the patient problem list at the beginning of the study period, we grouped physical health diagnoses according to hierarchical condition categories14,15 and behavioral health diagnoses according to the Diagnostic and Statistical Manual.16 A table of the ICD-10 codes used for diagnosis identification appears in Appendix 1. We included binary variables for gender and the presence of any monthly income, as well as categorical variables for race and housing status, each drawn from EHR registration data as of each patient's last visit. We considered patients whose housing status indicated sleeping on the street or in emergency shelters to be homeless.
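As an illustration of this binning scheme, the sketch below applies R's `cut()` to a hypothetical patient-level data frame; the column names are assumptions, not our actual variable names.

```r
# Illustrative binning of the continuous classifying variables
# (hypothetical data frame and column names).
library(dplyr)

patients_binned <- patients %>%
  mutate(
    age_cat = cut(age, breaks = c(18, 30, 40, 50, 60, Inf),
                  labels = c("18-29", "30-39", "40-49", "50-59", "60+"),
                  right = FALSE),
    med_hosp_cat = cut(medical_hospitalizations,
                       breaks = c(0, 1, 2, 3, Inf),
                       labels = c("0", "1", "2", "3+"), right = FALSE),
    psych_hosp_cat = cut(psychiatric_hospitalizations,
                         breaks = c(0, 1, 2, 3, Inf),
                         labels = c("0", "1", "2", "3+"), right = FALSE),
    ed_visit_cat = cut(ed_visits, breaks = c(0, 1, 2, 4, 7, Inf),
                       labels = c("0", "1", "2-3", "4-6", "7+"),
                       right = FALSE)
  )
```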
Data analysis
We used LCA to identify an initial set of subgroups of patients seen at OTC. LCA is an unsupervised machine learning method that identifies unobserved subgroups of patients in a larger population.17 Based on previous quality improvement efforts, we hypothesized that a useful model would have between 15 and 30 classes.
To evaluate model fit for the latent class analysis, we compared models using the Akaike information criterion (AIC), Bayesian information criterion (BIC), log-likelihood, G-squared, and entropy values for each model. We prioritized AIC values in selecting potentially appropriate models to describe our population.18–20 We planned to select the four models with the lowest AIC for the stakeholder feedback phase of model creation. Data analysis was conducted in 2018.
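For concreteness, a sketch of this class-enumeration step with the poLCA package follows; the variable names are placeholders for the classifying variables described above, and poLCA expects the categorical indicators to be factors or integers starting at 1.

```r
# Illustrative class enumeration over 15-30 classes (placeholder variable
# names; the full formula would include all classifying variables).
library(poLCA)

f <- cbind(age_cat, med_hosp_cat, psych_hosp_cat, ed_visit_cat,
           diabetes, copd_asthma, chf, hep_c) ~ 1

fits <- lapply(15:30, function(k) {
  # nrep restarts the EM algorithm from multiple random starting values
  # to reduce the chance of settling in a local maximum.
  poLCA(f, data = patients_binned, nclass = k, maxiter = 5000, nrep = 10)
})

fit_stats <- data.frame(
  nclass = 15:30,
  AIC  = sapply(fits, `[[`, "aic"),
  BIC  = sapply(fits, `[[`, "bic"),
  llik = sapply(fits, `[[`, "llik"),
  Gsq  = sapply(fits, `[[`, "Gsq")
)

# The four lowest-AIC models advance to the stakeholder feedback phase.
fit_stats[order(fit_stats$AIC), ][1:4, ]
```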
Stakeholder group qualitative validation
To improve our chances of identifying clinically meaningful groups for population health management and strategic planning, we sought to incorporate feedback on our models from end users. Over the course of a month, we engaged an inter-professional team of physicians, social workers, substance use counselors, social service staff, quality management professionals, and administrators in a co-design process. The team of 12 individuals represented a cross-section of disciplines and consisted of staff who would use the final classification model in clinical practice, population health management, and strategic planning. In two initial design sessions, the team identified prospective variables for the analysis and envisioned possible applications of a classification model.
After performing the LCA, we developed a sorting activity to engage the inter-professional team in qualitative analysis of the four candidate models. For the activity, we listed the characteristics of each class of the candidate models on index cards, totaling 110 cards. In small group and individual activities, we asked team members to arrange the cards on a table into subgroups that matched the patients they served, narrating as they worked to give the research team insight into their thought processes. At the end of each activity, we photographed the sorted cards to record the results (Appendix 2). We integrated feedback from the activity into the final classification model by identifying recurring clusters of cards and synthesizing key insights from the narration into a final set of classes. For example, some participants grouped a card from one model representing a class over age 50 with high prevalence of diabetes, heart disease, and kidney disease with a card from another model representing a class over age 50 with high prevalence of depression, COPD, diabetes, and heart disease. Although the high-prevalence conditions on these cards differed somewhat, participants identified them as representing the same real-world class. No participants grouped the cards identically, but patterns emerged, with common groupings by age, behavioral health conditions, and higher numbers of physical conditions. The inter-professional team deemed several classes from the latent class analysis uninformative, leading us to consolidate them. Based on common patterns and key insights from a few participants, the analysts (MM, MS) synthesized the results of the card sorting activity into a set of proposed classes, which were reviewed with the participants before developing the final model.
Based on the synthesis of the card sorting activity and feedback from the review with the inter-professional team, we identified a final set of classes and created a simulated dataset for training a neural network model. The data simulation used the observed distributions of patient characteristics as a starting point, and we adjusted the distributions based on stakeholder feedback from the sorting activity. We included all of the variables from the latent class analysis in the simulated data except race, gender, and housing status. After a preliminary review of the analysis, the inter-professional team decided that race, gender, and housing status should not be used to determine class membership; instead, these characteristics should be explicitly considered as part of population health management for all patients at the FQHC. We trained the final classification model on the simulated data using a feed-forward neural network with two hidden layers and L2 regularization, withholding 25% of the data as a testing set. We then predicted class membership for the original OTC population using the final model. Although the model output is probabilistic, we report a single class for each patient based on the maximum posterior probability.
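The following is a minimal sketch of such a classifier in R keras, assuming the simulated data have been encoded as a numeric matrix `x_sim` with integer class labels `y_sim` (0-indexed) and that `x_otc` holds the encoded OTC population; the layer sizes, regularization strength, and training hyperparameters are illustrative assumptions, not our exact configuration.

```r
# Illustrative feed-forward classifier with two hidden layers and L2
# regularization (`x_sim`, `y_sim`, `x_otc`, `n_classes` are placeholders).
library(keras)

set.seed(42)
test_idx <- sample(nrow(x_sim), size = round(0.25 * nrow(x_sim)))  # 25% test set

model <- keras_model_sequential() %>%
  layer_dense(units = 64, activation = "relu", input_shape = ncol(x_sim),
              kernel_regularizer = regularizer_l2(0.001)) %>%  # hidden layer 1
  layer_dense(units = 32, activation = "relu",
              kernel_regularizer = regularizer_l2(0.001)) %>%  # hidden layer 2
  layer_dense(units = n_classes, activation = "softmax")       # class probabilities

model %>% compile(
  optimizer = "adam",
  loss = "sparse_categorical_crossentropy",
  metrics = "accuracy"
)

model %>% fit(
  x_sim[-test_idx, ], y_sim[-test_idx],
  validation_data = list(x_sim[test_idx, ], y_sim[test_idx]),
  epochs = 50, batch_size = 32
)

# Assign each patient in the OTC population the class with the
# maximum posterior probability.
probs <- model %>% predict(x_otc)
assigned_class <- max.col(probs)
```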
We used the R statistical programming language, version 3.5, for all analyses.21 The poLCA package22 was used to conduct the latent class analyses, and the keras package23 was used to train the final model.