Data Source
PRIORity (Predicting Risk and Investigating Outcomes using Patient-Reported and Community-level Social Determinants Data in Vulnerable Populations) is a prospective observational cohort study of randomly selected, high-risk adult Emergency Department (ED) patients (those with at least one chronic medical condition) who reside in New York City (NYC). Research staff administered a 36-item survey that included each participant’s home address (the survey is available as Appendix: Priority Enrollment Survey). For the purposes of our example, we use self-reported household income and self-reported race/ethnicity from this cohort.
Definition of neighborhood
Neighborhood was operationalized as the area reachable within a ¼ mile walking distance of each participant’s address along pedestrian-accessible networks. This definition offers a more realistic and nuanced description of “neighborhood,” and better reflects a participant’s lived experience, than more commonly used definitions such as a simple Euclidean buffer (e.g., a circle with a ¼ mile radius) or simple containment (the census tract or ZIP code in which the home address is located).(8)
Calculation of neighborhood variables
All spatial methods were performed with ArcGIS Pro 3.0 (ESRI, Redlands, CA) and all other methods with RStudio.(9) Of the 150 survey participants, two (1.3%) did not have a valid home address in NYC, resulting in an analytic sample of 148 participants, whose addresses were geocoded using ESRI’s world geocoding services. American Community Survey 2020 5-year estimates of median household income (MHHI) and race/ethnicity by census block group (CBG) in NYC were acquired from the Integrated Public Use Microdata Series (IPUMS) National Historical Geographic Information System (NHGIS)(10) and spatialized. Pedestrian-accessible routes were identified by filtering the LION (Linear Integrated Ordered Network) dataset from the NYC Department of City Planning.(11) Pedestrian-accessible network buffers were then created by measuring ¼ mile (approximately 400 m) along the network from each participant’s home location. Race/ethnicity for each participant’s “neighborhood” was then calculated using areal weighting:(12) the area of the portion of each CBG intersected by the buffer is calculated, and the ratio of that intersected area to the total CBG area is used to weight the CBG’s population counts (e.g., if CBG “A” has 25% of its area within the buffer, we assume that 25% of its population is also within the buffer). The neighborhood MHHI was then calculated as a population-weighted mean based on the areal weighting results (e.g., if CBG “A” has 100 residents within the buffer and an MHHI of $10,000, and CBG “B” has 50 residents within the buffer and an MHHI of $40,000, the population-weighted mean MHHI is ((100 × $10,000) + (50 × $40,000)) / (100 + 50) = $20,000).
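The two weighting steps can be illustrated with a minimal Python sketch. It is not the ArcGIS Pro workflow used in the study; the class name, field names, and CBG totals below are hypothetical and were chosen only to reproduce the worked example above (100 and 50 residents within the buffer).

```python
from dataclasses import dataclass


@dataclass
class BlockGroupOverlap:
    """One census block group (CBG) intersected by a participant's buffer.

    All field names are hypothetical; areas may be in any consistent unit.
    """
    cbg_id: str
    cbg_area: float      # total area of the CBG
    overlap_area: float  # area of the CBG that falls inside the buffer
    population: float    # total CBG population
    mhhi: float          # CBG median household income (dollars)


def areal_weighted_population(o: BlockGroupOverlap) -> float:
    """Areal weighting: assume population is spread evenly over the CBG."""
    return o.population * (o.overlap_area / o.cbg_area)


def neighborhood_mhhi(overlaps):
    """Population-weighted mean MHHI over all CBGs touching the buffer."""
    weights = [areal_weighted_population(o) for o in overlaps]
    total = sum(weights)
    if total == 0:
        return None  # no residents estimated inside the buffer
    return sum(w * o.mhhi for w, o in zip(weights, overlaps)) / total


# Illustrative values only: CBG "A" contributes 25% of 400 residents (100),
# CBG "B" contributes 50% of 100 residents (50).
example = [
    BlockGroupOverlap("A", cbg_area=1.0, overlap_area=0.25, population=400, mhhi=10_000),
    BlockGroupOverlap("B", cbg_area=1.0, overlap_area=0.50, population=100, mhhi=40_000),
]
print(neighborhood_mhhi(example))  # 20000.0
```

The same `areal_weighted_population` weights could be applied to each race/ethnicity count in a CBG to estimate group counts within the buffer.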
Areal and population weighting, in combination with pedestrian-accessible network buffers, help reduce the impact of edge effects, the modifiable areal unit problem, and other sources of geospatial error.(13) Participant survey data include race/ethnicity categories and household income (HHI) reported in $20,000 intervals.
Calculation of concordance / discordance for a categorical variable
To compare participant race/ethnicity with neighborhood-level characteristics, the majority race/ethnicity for each neighborhood was determined (i.e., a single race/ethnicity group accounting for > 50% of the neighborhood population). If no group exceeded 50%, the neighborhood was coded as “No Majority.” Concordance was defined as the participant’s race/ethnicity matching that of the neighborhood’s majority group, and discordance as the participant’s race/ethnicity differing from that of the majority group. Participants were coded as “Neutral” if their neighborhood had no majority.
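As a sketch of this coding rule, the Python functions below take areal-weighted counts per race/ethnicity group and return the neighborhood majority and the participant’s classification. The function names and the group labels in the example are hypothetical and are not the survey’s actual categories.

```python
def majority_group(weighted_counts):
    """Return the group holding strictly more than 50% of the neighborhood
    population, or "No Majority" if no group does."""
    total = sum(weighted_counts.values())
    if total <= 0:
        return "No Majority"
    for group, count in weighted_counts.items():
        if count / total > 0.5:
            return group
    return "No Majority"


def race_ethnicity_concordance(participant_group, weighted_counts):
    """Classify a participant against their neighborhood's majority group."""
    majority = majority_group(weighted_counts)
    if majority == "No Majority":
        return "Neutral"
    return "Concordant" if participant_group == majority else "Discordant"


# Hypothetical group labels and counts, for illustration only.
neighborhood = {"Group 1": 120.0, "Group 2": 60.0, "Group 3": 20.0}
print(majority_group(neighborhood))                         # Group 1
print(race_ethnicity_concordance("Group 2", neighborhood))  # Discordant
```

Because the threshold is a strict inequality, a group holding exactly 50% of the population does not count as a majority in this sketch.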
Calculation of concordance / discordance for a continuous variable
To compare participant-reported HHI with neighborhood-level MHHI, citywide MHHI quintiles were calculated from CBG-level data for NYC. The break values for the quintiles (Q1, Q2, Q3, Q4, Q5) were then adjusted to match the nearest break values in the survey data, resulting in < $40,000 (Q1), $40,000 to < $60,000 (Q2), $60,000 to < $80,000 (Q3), $80,000 to < $100,000 (Q4), and >= $100,000 (Q5). Concordance was defined as the participant’s HHI and the neighborhood MHHI falling in the same quintile (e.g., both in Q1), and discordance as the two differing by two or more quintiles (e.g., the participant’s HHI is in Q1, but the neighborhood MHHI is in Q3, Q4, or Q5). Participants were coded as “Neutral” when their HHI quintile was exactly one quintile away from that of the neighborhood (e.g., participant in Q1 and neighborhood in Q2).
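The quintile assignment and the three-way classification can be sketched in Python as follows. The function names are hypothetical. Because participant HHI is reported in $20,000 intervals that align with the adjusted breaks, the sketch assumes any dollar value inside the reported interval (for example, its lower bound) is passed in for the participant.

```python
import bisect

# Adjusted quintile breaks from the text (lower bounds of Q2 through Q5).
BREAKS = [40_000, 60_000, 80_000, 100_000]


def income_quintile(income):
    """Map a dollar amount to a quintile number, 1 (Q1) through 5 (Q5)."""
    return bisect.bisect_right(BREAKS, income) + 1


def income_concordance(participant_hhi, neighborhood_mhhi):
    """Classify a participant by the gap between the two quintiles."""
    gap = abs(income_quintile(participant_hhi) - income_quintile(neighborhood_mhhi))
    if gap == 0:
        return "Concordant"
    if gap == 1:
        return "Neutral"
    return "Discordant"


print(income_quintile(39_999))              # 1
print(income_quintile(100_000))             # 5
print(income_concordance(20_000, 45_000))   # Neutral
print(income_concordance(20_000, 75_000))   # Discordant
```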