Association between Racial Residential Segregation and COVID-19 Mortality

This study investigates the impact of racial residential segregation on COVID-19 mortality during the first year of the US epidemic. Data come from the Centers for Disease Control and Prevention (CDC) and from the County Health Rankings project run jointly by the Robert Wood Johnson Foundation and the University of Wisconsin. The analysis sample includes records for 8,670,781 individuals in 1488 counties. We regressed COVID-19 deaths on individual- and county-level predictors using hierarchical logistic regression models. We found that as racial residential segregation increased, mortality rates increased. Controlling for segregation, Blacks and Asians had a greater risk of mortality, while Hispanics and other racial groups had a lower risk of mortality, compared to Whites. The impact of racial residential segregation on COVID-19 mortality did not vary by racial group.


Introduction
The SARS-CoV-2 (COVID-19) pandemic has had catastrophic effects on populations around the world. By March 21, 2023, over 761 million individuals had been infected, and over 6.8 million had died from COVID-19 globally [1]. In the USA, there had been over 90 million cases and 1.1 million deaths as of March 10, 2023 [2].
Racialized minorities in the USA experience higher burdens of illness and mortality than White people [3]. COVID-19 has proven to be no exception. At the county level, majority Black counties experienced three times the rate of COVID-19 infection and nearly six times the death rate compared to majority White counties [4][5][6][7]. This finding holds at the individual level as well. Black patients with COVID-19 experienced more severe illness, 1.4 times the risk of hospitalization, and 1.36 times greater risk of dying of COVID-19 compared to Whites [8][9][10]. These disparities are not limited to Blacks and Whites. Young and middle-aged Hispanics, Native Americans, and Native Alaskans also were more likely to die of COVID-19 compared to Whites [11]. A study focusing only on insured patients, however, found that no racial differences in COVID-19 deaths remained after controlling for pre-existing chronic health conditions [12].
These pre-existing chronic conditions, which increased vulnerability to COVID-19, are also unequally distributed, with racialized minorities having higher rates of underlying chronic health conditions. Chronic diseases are associated with later life; however, recent research has found that Blacks develop chronic conditions 5 to 10 years earlier compared to other racial groups [13,14].
Hypertension and kidney disease, in particular, are chronic conditions associated with severe COVID-19 outcomes [15]. Both Blacks and Hispanics have higher prevalence rates of hypertension compared to Whites [16][17][18], as well as higher prevalence rates of end-stage renal disease compared to Whites [19,20].
The US media quickly noted racialized minorities' increased vulnerability to COVID-19 but implied that the lifestyles of minorities led them to greater overall poor health. They never questioned whether the same structural and systemic inequalities that are the root or fundamental cause of racialized disparities in chronic health conditions also put minorities at greater risk of COVID-19 [21,22]. Dressler et al. [23] challenge biological and lifestyle differences as having poor explanatory power for racialized disparities in chronic conditions compared to systemic structural inequalities.
COVID-19 is an infectious disease, not a chronic disease. It is caused by a virus that is spread through the air. Viruses such as these are equal opportunity infectious agents that do not discriminate. Why then do we see racialized disparities in COVID-19 deaths? It is important to determine if the same systemic factors underlying chronic conditions also put racialized minority groups at greater risk of infectious disease.
Much public health research has used individual-level race as a proxy for the systemic racism effects of segregation rather than understanding how systemic racism leads to poor health outcomes [22]. In this study, we examine if existing structural disparities as manifested through racial residential segregation contribute to racialized disparities in COVID-19 mortality in urban areas across the USA, as has been demonstrated with chronic illnesses [24][25][26][27][28][29]. Studies have demonstrated the enduring negative impact of structural racist practices such as redlining, which excluded racial minorities from specific neighborhoods, leading to racial residential segregation [30], and contributing to racial disparities in health outcomes [31].
Racial residential segregation has long characterized urbanization patterns in the USA [32]. In the late nineteenth and early twentieth centuries, various migrations of Blacks alongside discriminatory practices rooted in White supremacy shaped the residential distribution of races, thus contributing to growing ethnic residential segregation across cities in the USA [32][33][34][35]. The prevalence of redlining, along with other exclusionary tactics such as denying full access to housing markets through discriminatory practices used by landlords and mortgage lenders, especially against the Black population, has contributed to the continued racial segregation of the urban population [36][37][38][39].
Racial residential segregation, which also is associated with racial differences in socioeconomic status (SES), is associated with a larger system of inequality [29,36,40-42]. Thus, racialized groups live in vastly different environments with differential access to resources and opportunities, which leads to disparities in chronic health outcomes.
Studies across cities in the USA have shown associations between segregation and racialized disparities in chronic health outcomes [24-29, 43, 44], suggesting the need to consider both individual-level race and racial residential segregation when investigating racialized disparities in COVID-19 [36]. Rather than solely controlling for race, which assumes an equal role across all racial groups despite evidence to the contrary, our study adopts an interaction model to examine the associations between race and residential segregation [36]. Multilevel analysis plays a vital role in examining how structural disparities, other aspects of community context, and individual-level factors bear on individual-level health outcomes [45].
Considering that aggregate measures of SES and race are confounded, we additionally control for county-level poverty and population density in order to isolate the effect of racial residential segregation on COVID-19 mortality [44][45][46][47]. More densely populated areas are at greater risk from respiratory infectious diseases [48]. At the individual level, we control for age and gender. Controlling for gender allows us to examine the specific impact of residential segregation on COVID-19 while considering gender-related factors, given the gendered differences in health behaviors [49] and in COVID-19 cases and mortality rates [50]. A majority of research supports that increased COVID-19 mortality risks are found among the older adult population [51,52].
The merged dataset initially included 24,441,351 individuals who contracted COVID-19 within 2814 US counties. To be consistent with earlier work on racial residential segregation and health, we limit the sample to urban and suburban counties. The National Center for Health Statistics has developed an urban-rural county classification scheme specifically designed to work with health data [54]. Based on their definition, 1306 counties defined as noncore metropolitan areas were removed from the dataset. This reduced the individual-level sample to 14,134,595 (see Table 1).
Next, we removed all cases missing on the dependent variable, survival versus death from COVID-19, which reduced the sample to 11,781,056 cases in 1504 counties. Then, we eliminated all missingness on categorical race, which dropped the sample to 8,710,401 in 1488 counties, reflecting a reduction of over 66% from the original sample size. Lastly, item missingness on all remaining covariates was listwise deleted for a final analysis dataset size of 8,670,781 in 1488 counties.
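The exclusion steps above can be tabulated in a few lines. The following is a minimal Python sketch, not the authors' code, that takes the sample sizes quoted in the running text and reports the share retained at each step; the step labels are our own shorthand.

```python
# Sample sizes quoted in the running text at each exclusion step.
# The labels are shorthand introduced here for illustration.
steps = [
    ("merged dataset", 24_441_351),
    ("urban and suburban counties only", 14_134_595),
    ("known survival outcome", 11_781_056),
    ("known race", 8_710_401),
    ("complete covariates (analysis sample)", 8_670_781),
]


def attrition(steps):
    """Return (label, n, share of first step, share of previous step) rows."""
    first = steps[0][1]
    previous = first
    rows = []
    for label, n in steps:
        rows.append((label, n, n / first, n / previous))
        previous = n
    return rows


for label, n, of_first, of_previous in attrition(steps):
    print(f"{label:40s} {n:>12,d}  {of_first:6.1%} of merged  "
          f"{of_previous:6.1%} of previous step")
```

Running it gives the retained and lost shares relative to either the merged dataset or the urban and suburban subsample, which makes it easy to check any quoted reduction against the counts.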
We were concerned about the high level of missingness on both COVID-19 survival and individual-level race in the dataset because the less-than-ideal quality of the data may bias our findings. In this section, we explore data missingness and while this is interesting in and of itself, the goal is always to understand the associations between racial residential segregation and racial disparities in COVID-19 mortality.
Research has documented that existing COVID-19 datasets have widespread missingness or incompleteness [55,56]. COVID-19 data are sent to the CDC voluntarily from state, local, and territorial public health departments. The different locales may use different definitions or reporting rules, leading to discrepancies [57]. The CDC also finds that some demographic data (race and SES) are missing due to the strain that COVID-19 surveillance placed on state reporting agencies. This is consistent with the fact that only four counties were dropped when we listwise deleted missingness on the individual-level race variable.
Another reason for missingness could be that at-home tests are exempt from test reporting mandates [56]. Positive at-home tests may or may not be reported, but even if reported, mild or nonexistent symptoms may have led to no further contact with medical facilities or reporting agencies, leaving the outcome unknown.
It is likely that missingness is associated with individual-level race. For example, one study found that only twelve states reported racialized breakdowns of hospitalizations [55]. Patel et al. [58] found that Blacks were less likely to trust their doctors and federal health agencies compared to Whites, while Asians expressed greater trust than Whites. To the extent that trusting the medical structure is associated with reporting, this may impact missingness on survival outcomes. Other research found that younger, higher SES Whites were more likely to use at-home tests [56].
There may be errors within the survival versus death outcome variable as well. Research finds that undercounting of deaths is more likely than overcounting, based on the increase in deaths since the pandemic onset when compared to the same timeframe in prior years [59,60]. Reporting causes of death can differ across agencies within the same state, creating noise in the survival rates [57]. This suggests that there is a good bit of noise in these data; however, they are the best data currently available. Results should therefore be interpreted with some caution.
Table 2 presents an examination of the CDC COVID-19 dataset compared to the 2020 Census population data broken out by race in order to understand what exactly we have in the COVID-19 dataset. As a reminder, all rural counties have been dropped, so the CDC dataset is an urban and suburban sample. First, it appears that Whites, Blacks, Asians, Native Americans, and multi or other racial groups are underrepresented in the COVID-19 sample compared to their representation in the 2020 Census, while Latinx and the unknown category are highly overrepresented.
Just over 16% of the CDC sample has an unknown COVID-19 outcome. Half of those with an unknown survival outcome were also missing on race. The unknown racial group, which makes up 30% of the CDC sample, is underrepresented in mortality with just 12% dead. These two pieces of information are consistent with the idea that those who tested positive with mild or no COVID-19 symptoms did not have enough interaction with healthcare facilities for demographic or outcome data to be collected. Therefore, the remaining data are likely generalizable to those with more severe COVID-19 symptoms.
The proportions for each racial group found in the CDC dataset (column labeled sample size) and the reduced CDC dataset (dropping all with unknown ...
This analysis tells us that our final sample likely consists of those with more severe COVID-19 symptoms and that three racialized minority groups were overrepresented among those with more severe symptoms.

Constructs
The dependent variable is measured at the individual level as a binary variable where death from COVID-19 equals 1 and survival equals 0. We include measures of race, gender, and age at the individual level. Dummy variables were created for the following racial groups: Whites, Blacks, Hispanics, Asians, and others (which include American Indians, Alaska Natives, multiple/other, Native Hawaiian/other Pacific Islanders). Gender was measured as a dummy variable with women (1) and men (0). Age originally was measured as a categorical variable starting with 0-9, 10-19, 20-39, 40-49, 50-59, 60-69, 70-79, and 80+ years. We recoded age as a dummy variable, with age 60 or greater equal to 1 and age less than 60 equal to 0.
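The recoding described above can be sketched as follows. This is an illustrative Python sketch only: the field names (`race`, `outcome`, `sex`, `age_band`) and the category labels are hypothetical and are not taken from the CDC file, and records with missing values are assumed to have been dropped already.

```python
# Hypothetical field names and category labels; the CDC file's actual
# codes may differ. Records are assumed to have no missing values.
OLDER_AGE_BANDS = {"60-69", "70-79", "80+"}
NAMED_GROUPS = ("White", "Black", "Hispanic", "Asian")  # anything else -> "Other"


def recode(record):
    """Turn one raw case record into the 0/1 indicators described in the text."""
    race = record["race"] if record["race"] in NAMED_GROUPS else "Other"
    return {
        "death": 1 if record["outcome"] == "died" else 0,
        "female": 1 if record["sex"] == "female" else 0,
        "age60plus": 1 if record["age_band"] in OLDER_AGE_BANDS else 0,
        # White is the omitted reference category, so it gets no dummy.
        "black": int(race == "Black"),
        "hispanic": int(race == "Hispanic"),
        "asian": int(race == "Asian"),
        "other": int(race == "Other"),
    }


example = recode({
    "race": "Native Hawaiian/Other Pacific Islander",
    "outcome": "died",
    "sex": "female",
    "age_band": "70-79",
})
print(example)
```

Leaving Whites, men, and those under 60 without their own indicators makes them the reference categories, matching the reference groups reported with the regression results.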
At the county level, we measured racial residential segregation as the dissimilarity index. The index measures the evenness of racial distributions geographically, ranging from zero, or a completely even distribution of the races across census tracts, to one hundred, or a completely uneven distribution of the races across census tracts within counties. The dissimilarity index was calculated using the following formula:

$$D = 100 \times \frac{1}{2} \sum_{i} \left| \frac{P_{1i}}{P_{1}} - \frac{P_{2i}}{P_{2}} \right| \tag{1}$$

where $P_{1}$ and $P_{2}$ are the county-wide populations of people of color and Whites (not Hispanic/Latino), respectively, and $P_{1i}$ and $P_{2i}$ are the census tract-level populations of people of color and Whites, respectively. The sum runs over the census tracts $i$ within a county, and the index was computed for each of the 1488 counties.
The dissimilarity index can be interpreted as the proportion of people of color that would have to change their tract of residence to a White tract to equalize the distributions of the different racial groups [61]. The segregation variable was grand mean centered for the analyses. Our study calculated the dissimilarity index using census-tract data to measure county-level racial segregation, but without accounting for uneven distributions within tracts and relying on ACS estimates, introducing a potential margin of error [62]. Despite the limitations, we chose the widely used dissimilarity index to be consistent with existing research and because it is relevant to our research question, focusing on the distribution and separation of racial groups in residential areas. As Massey and Denton [63] found, many of the segregation indices are highly correlated. While other segregation measures exist [64], this index specifically captures the evenness dimension of segregation, indicating how racial groups are distributed across census tracts within a county.
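As a concrete illustration of the index, the sketch below computes the standard two-group dissimilarity index on the 0 to 100 scale from tract-level counts. It is a minimal Python sketch under our own assumptions: the function name and the input layout (one pair of counts per census tract) are hypothetical, and the tract counts are made up.

```python
def dissimilarity_index(tracts):
    """Two-group dissimilarity index for one county, on a 0-100 scale.

    tracts: list of (people_of_color, white) population counts,
            one pair per census tract in the county.
    """
    total_poc = sum(poc for poc, _ in tracts)
    total_white = sum(white for _, white in tracts)
    if total_poc == 0 or total_white == 0:
        raise ValueError("index is undefined when one group has no residents")
    # Half the summed absolute difference between each tract's share of the
    # county's people of color and its share of the county's White residents.
    return 100 * 0.5 * sum(
        abs(poc / total_poc - white / total_white) for poc, white in tracts
    )


# Made-up counties with two tracts each.
print(dissimilarity_index([(50, 50), (50, 50)]))    # identical tract shares
print(dissimilarity_index([(100, 0), (0, 100)]))    # complete separation
print(dissimilarity_index([(80, 20), (20, 80)]))    # partial separation
```

The first two calls show the endpoints of the scale: identical tract shares give 0 and complete separation gives 100.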
The population density was calculated as the average number of people per square mile of the land area [53]. The population density was log transformed to reduce skew and grand mean centered for the analyses. This controls for the fact that racialized minorities and lower socio-economic status (SES) populations tend to live in more densely populated places. Finally, we include a county-level measure of SES, the percentage of the county that lives in poverty. This variable was also log transformed to reduce skew. The CDC COVID-19 data lack individual-level measures of SES. Including this county-level poverty measure ought to pick up some of the individual-level SES variations that might otherwise bias the association between individual-level race and COVID-19 deaths.
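The two transformations mentioned here, log transformation and grand mean centering, can be sketched as below. This Python sketch rests on assumptions the text does not settle: it uses the natural logarithm and centers on the mean across counties, and the input values are illustrative only.

```python
import math


def grand_mean_center(values):
    """Subtract the overall mean so that 0 represents the average unit."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def log_then_center(values):
    """Natural-log transform (assumed base) followed by grand mean centering."""
    return grand_mean_center([math.log(v) for v in values])


# Illustrative county values only (persons per square mile, index points).
density = [0.58, 120.0, 562.0, 27820.0]
segregation = [30, 36, 42]

print(log_then_center(density))
print(grand_mean_center(segregation))
```

After centering, a value of 0 corresponds to a county at the average, which is why the race main effects in an interaction model can be read as gaps at average segregation.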
Lastly, interaction terms were computed between individual-level racial group dummy variables and the county-level index of dissimilarity in order to determine if the relationships between racialized groups and COVID-19 vary by level of racial residential segregation.

Statistical Analysis
As individuals were nested within counties in this analysis, hierarchical logistic regression models were estimated to examine the associations between COVID-19 deaths, race, and racial residential segregation. The lme4 package in R was used for these analyses. The following random intercepts model was run:

$$\log\left(\frac{P_{ij}}{1 - P_{ij}}\right) = \gamma_{00} + \gamma_{10} X_{ij} + \gamma_{01} W_{j} + \gamma_{11} W_{j} X_{ij} + \mu_{0j} + r_{ij} \tag{2}$$

where $\gamma_{00} \sim N(\beta_{0}, \sigma^{2})$, $P_{ij}$ represents the probability of death occurring for individual $i$ in county $j$, $\gamma_{00}$ is the intercept, $\gamma_{10} X_{ij}$ represents the individual-level predictor variables, $\gamma_{01} W_{j}$ represents the county-level predictors, and $\gamma_{11} W_{j} X_{ij}$ represents the cross-level race by residential segregation interaction terms. Finally, there were two random terms in the model. The random term $\mu_{0j}$ is the unmodeled level-2 variability for each county $j$, and $r_{ij}$ allows for individual variation within county $j$.

Results
Descriptive statistics for the analysis sample are shown in Table 3. The top half of the table provides individual-level descriptive statistics. From January 1, 2020, through April 15, 2021, 4% of those who had been diagnosed with COVID-19 died. In terms of demographic composition, women account for 53% of the sample. Whites comprise 46%, Blacks make up 10%, Hispanics make up 33%, Asians are 4%, and other racial groups make up 8% of the sample. Twenty-one percent of the sample is age 60 or older.
The bottom half of Table 3 shows the descriptive statistics for county-level variables. The average dissimilarity index is 36. This means that, on average, 36% of people of color would need to relocate to a different census tract to achieve an even racial distribution across the county. The dissimilarity index ranges from 0 (no segregation) to 80 (very high segregation). In terms of poverty, on average, almost 12% of each county's residents are living in poverty, and this ranges across counties from 3 to almost 40%. The average county-level population density is 562 persons per square mile. This ranges from a low of 0.58 to a high of 27,820 per square mile. Lastly, the average cluster size within each county is 5827 individuals, while cluster sizes across counties range from 1 to 820,577 individuals.
Table 4 presents odds ratios and confidence intervals from three nested random intercept hierarchical logistic regression models. An odds ratio greater than 1.0 can be interpreted as increasing the likelihood of mortality from COVID-19, and an odds ratio of less than 1.0 reduces the likelihood of mortality from COVID-19. Reference categories are Whites, age less than 60 years, and men.
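To make the reading of odds ratios concrete, the sketch below converts an odds ratio into a percent change in odds and applies it to a baseline probability. It is an illustrative Python sketch: the odds ratio of 1.20 and the 4% baseline are example inputs chosen here, not values read from Table 4, and the function names are our own.

```python
def percent_change_in_odds(odds_ratio):
    """Percent change in the odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


def apply_odds_ratio(baseline_probability, odds_ratio):
    """Probability obtained by scaling the baseline odds by an odds ratio."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Example inputs only: an odds ratio of 1.20 and a 4% baseline probability.
print(percent_change_in_odds(1.20))
print(apply_odds_ratio(0.04, 1.20))
```

Because an odds ratio scales odds rather than probabilities, the implied change in probability depends on the baseline; for rare outcomes the two are close but not identical.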
Model 1 tests hypothesis 1, which examines the higher likelihood of COVID-19 mortality among racialized minorities compared to Whites. We find different patterns of COVID-19 mortality risk based on race. Blacks and Asians who have contracted COVID-19 are at greater risk of death compared to Whites (20% and 15%, respectively). On the other hand, Hispanics and other racialized groups have a reduced risk of death compared to Whites (16% and 37%, respectively). Thus, hypothesis 1 is only partially supported as only Blacks and Asians demonstrate a greater risk.
Model 3 tests hypothesis 3, which investigates whether the association between racialized minority groups and COVID-19 mortality varies depending on the level of residential segregation by incorporating cross-level race by segregation interaction terms. The main effects for the race dummy variables can be interpreted as the gaps in COVID-19 mortality risk for those living in counties with average racial residential segregation. The main effects for the individual-level race variables change only slightly when introducing the interactions. Blacks and Asians living in counties with average levels of residential segregation are 18% and 14% more likely to die of COVID-19 compared to Whites, respectively. In contrast, Hispanics and other racialized groups are 16% and 39% less likely to die of COVID-19 compared to Whites. The main effect of racial residential segregation can be interpreted as the impact of racial residential segregation for Whites. For Whites, as segregation (the main effect) increases by one point, the risk of COVID-19 death increases by 1% on average. For Blacks, as racial residential segregation increases by one point above average, Blacks' probability of mortality increases by an additional 0.2%. For each additional unit of segregation above the mean, the risk of death increases for other racialized groups by an additional 1.3%. Conversely, as segregation increases above the mean for Hispanics and Asians, the risk of death declines (0.4% and 0.9%, respectively).
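The way a race main effect, the segregation main effect, and their interaction combine in a logit model can be sketched as below. This is a Python sketch under stated assumptions: the odds ratios (1.18, 1.01, 1.002) are approximations read off the percentages in the prose rather than coefficients taken from Table 4, the function name is hypothetical, and all other covariates and the county random intercept are held equal.

```python
def odds_ratio_vs_average_white(race_or, segregation_or, interaction_or,
                                units_above_mean):
    """Odds of death for a group member in a county `units_above_mean` points
    above average segregation, relative to a White person in a county with
    average segregation.

    On the log-odds scale the terms add, so the odds ratios multiply:
    race effect x (segregation effect x interaction effect) ** units.
    """
    per_unit = segregation_or * interaction_or
    return race_or * per_unit ** units_above_mean


# Approximate odds ratios read off the prose for Blacks (assumed, illustrative).
BLACK_OR, SEGREGATION_OR, BLACK_BY_SEGREGATION_OR = 1.18, 1.01, 1.002

for units in (0, 5, 10):
    ratio = odds_ratio_vs_average_white(
        BLACK_OR, SEGREGATION_OR, BLACK_BY_SEGREGATION_OR, units
    )
    print(units, round(ratio, 3))
```

At zero units above the mean the function returns the race main effect alone, which is why the main effects are interpreted as gaps at average segregation.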
Again, we find only partial support for hypothesis 3, indicating that Blacks, Whites, and other racial groups residing in more segregated counties have a higher likelihood of COVID-19 mortality compared to Blacks, Whites, and other racial groups living in less segregated counties. For Hispanics and Asians, living in more segregated counties reduces the likelihood of COVID-19 mortality compared to living in less segregated counties. In all the models, the variance inflation factor (VIF) values for the variables were 1 or slightly more than 1, suggesting no evidence of multicollinearity.
These are nested models; therefore, we can test for model fit using a change in −2 log likelihood (−2LL) across the three models. Model 2 significantly improves fit over model 1, and model 3 significantly improves fit over model 2.
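The change in −2LL between nested models is conventionally compared to a chi-square distribution with degrees of freedom equal to the number of added parameters. The Python sketch below illustrates this for a comparison with four added terms (one interaction per non-reference racial group, which is our reading of the text); the −2LL values are hypothetical, and the tail probability uses the closed form that holds only for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** j / math.factorial(j) for j in range(df // 2)
    )


def likelihood_ratio_test(neg2ll_reduced, neg2ll_full, df):
    """Return the change in -2LL and its chi-square p-value."""
    statistic = neg2ll_reduced - neg2ll_full
    return statistic, chi2_sf_even_df(statistic, df)


# Hypothetical -2LL values, for illustration only.
statistic, p_value = likelihood_ratio_test(1000.0, 985.0, df=4)
print(statistic, p_value)
```

A comparison that adds an odd number of parameters would need a general chi-square routine instead of this even-df shortcut.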

Sensitivity Analysis
Given the many issues with this dataset and its large size, we ran several sensitivity analyses that can be found in Table 5. Model S1 in the first column assumes that missingness on the survival outcome is at random; we used multiple imputation, via the R package mice, to impute the missing outcomes and re-ran model 3 [65]. This model assumes that missingness is completely at random. If this model best approximates our Table 4 model 3 (column 5 in Table 5), then we can assume missingness on the outcome is random. Model S2 assumes missingness on the outcome was all survivals (changing missing to 0 on the outcome). This models our understanding that missingness on the survival outcome may be due to people with mild or no symptoms stopping contact with medical authorities. If this assumption is correct, these ...
Model S1, using multiple imputation, provides results that are a bit muddled. This may reflect a combination of factors, such as regression to the mean as some unknowns are treated as deaths and others as survivals, but it may also be that we have violated the assumptions of multiple imputation, given the large amount of missingness on the outcome as well as the non-random reasons for the missingness.
Treating the unknown on the outcome as survival (model S2) most closely replicates our Table 4 model 3 results in terms of significance and magnitude of the regression coefficients. This further strengthens our belief that the missingness on the outcome is due to milder COVID-19 symptoms and lack of continued medical interactions.
Model S3, which treats the unknown outcomes as all deaths, changes the findings considerably compared to Table 4 model 3. This is especially noticeable when looking at age. The odds ratio for age over sixty in model S3 is ten times smaller than it is in Table 4 model 3. This means that the majority of those missing on the outcome are younger than age sixty and likely had low or no symptoms, and that is why their outcome was not captured, rather than because of unreported deaths.
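The two bounding scenarios, treating every unknown outcome as a survival (model S2) or as a death (model S3), amount to a simple recode before refitting. The Python sketch below shows only that recoding step on made-up data; the function names are hypothetical and no model is fit.

```python
def fill_unknown_outcomes(outcomes, assumed_value):
    """Replace unknown outcomes (None) with an assumed value: 0 or 1."""
    if assumed_value not in (0, 1):
        raise ValueError("assumed_value must be 0 (survival) or 1 (death)")
    return [assumed_value if y is None else y for y in outcomes]


def death_rate(outcomes):
    """Share of cases coded as deaths."""
    return sum(outcomes) / len(outcomes)


# Made-up outcomes: 1 = death, 0 = survival, None = unknown.
raw = [1, 0, 0, None, 0, None, 0, 0, 0, 0]

all_survived = fill_unknown_outcomes(raw, 0)  # the model S2 scenario
all_died = fill_unknown_outcomes(raw, 1)      # the model S3 scenario

print(death_rate(all_survived), death_rate(all_died))
```

Refitting the model on each recoded outcome brackets the results between the most optimistic and most pessimistic assumptions about the unknown cases.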
Based on these analyses, our assumption of the sample consisting of those with more severe COVID-19 symptoms holds. Thus, we can only generalize the impact of racial residential segregation on racial groups with severe COVID-19 symptoms.
The random subsample analysis (model S4) also does a decent job of replicating the findings from Table 4 model 3. However, not all the individual-level racial groups or interactions remain statistically significant. This suggests that the power we have to detect effects in the models of Table 4 might be giving us type I errors, and thus we should be interpreting the results conservatively.

Discussion and Conclusion
This study focused on examining the relationship between racial groups, racial residential segregation, and COVID-19 death to contribute to the existing literature that suggests that racial residential segregation is an important factor in the prevalence of health inequalities [41,42]. We found that racial residential segregation is associated with COVID-19 mortality in urban and suburban American counties. Notably, within our dataset, approximately 21% of the data originates from urban counties in the southern states, commonly referred to as the "Deep South" [66], including Alabama, Florida, Georgia, Louisiana, Mississippi, North Carolina, South Carolina, Tennessee, and Texas, that have a history of institutionalized racial residential segregation [67].
Blacks have a higher likelihood of dying from COVID-19 compared to Whites. As segregation increases, both Blacks and Whites have an increased likelihood of death. Asians at average levels of segregation have a greater likelihood of death compared to Whites, but as segregation increases, Asians' risk of death decreases. This is an unusual finding; however, sensitivity analyses suggest that this significant finding may be a type I error. Asians, who represent 6% of the total US population, were underrepresented among those who contracted COVID-19. This finding may be a statistical artifact given the low overall COVID-19 mortality rate of 4% and the small proportion of Asians in the sample. Given that we suspect the sample consists mainly of the most severe COVID-19 cases, future research should explore if only Asians with severe symptoms visited medical facilities, which would explain to some extent their greater likelihood of death compared to Whites that decreased as segregation increased [68].
Hispanics are overrepresented in the COVID-19 dataset, meaning they are more likely to contract COVID-19 compared to Whites. Findings suggest, however, that they are less likely to die compared to Whites, and as segregation increases, mortality risk for Hispanics declines. Hispanics' greater risk of COVID-19 infection may be due to their overrepresentation among essential workers [69]. Hispanics are also more likely to live in multigenerational households, increasing the risk of spreading COVID-19 from essential workers to family members [70]. However, Hispanics have a lower likelihood of dying from COVID-19 compared to Whites, and as segregation increased, their risk of death decreased even more. This may be due to the "Hispanic Paradox," which suggests that despite their lower SES, Hispanics have a health advantage [71][72][73]. Palloni and Arias [74] found that the "Hispanic paradox" was found only among foreign-born Mexicans, but not Cubans, Puerto Ricans, or others.
Our findings are consistent with multiple studies that have provided evidence that racial residential segregation plays a crucial role in influencing racial disparities in health outcomes by limiting access to resources, reinforcing racial inequality in socioeconomic status, and contributing to concentrated poverty in communities of color [41,42,75]. Furthermore, studies that focus on other forms of structural inequalities that contribute to racial residential segregation, such as gentrification [76], White flight [77], and the enduring legacy of redlining [31], have shown their impact on individual health outcomes, particularly among people of color. This study's finding that the likelihood of death increases for Blacks as residential segregation increases highlights the potential impact of segregation in impeding their access to education and employment opportunities, as well as its influence on conditions such as poverty and inadequate housing, which previous research [36,41] has demonstrated to have significant implications for individual health, as racialized minorities are often confined to disadvantaged urban areas. This underscores the potential policy implication that reducing racial health disparities necessitates targeting improvements in socioeconomic conditions, not just at the individual level, but also at the geographic level, such as areas of racial residential segregation [41].
There are some limitations to this study. First, the data are problematic due to underreporting of COVID-19 outcomes and patients' race. While we are fairly certain our assumptions regarding missingness on the outcome due to mild to no COVID-19 symptoms are valid and upheld by the sensitivity analyses, they are merely assumptions.
In addition, because we excluded rural areas, as is typical of studies examining residential segregation, the findings generalize only to urban and suburban counties. Secondly, we do not have measures of individual-level SES available in the data. Measures of social class are often confounded with race [78]. We include a measure of county-level poverty, which is likely to pick up some of the associations of individual-level SES and COVID-19 mortality, but does it pick up all of those associations? We cannot be sure that some of the associations between individual-level race and COVID-19 deaths are not attributable in part to the missing individual-level SES variables. Finally, no data were collected on pre-existing chronic conditions like hypertension, diabetes, or kidney disease, which likely confound our results to some extent. However, we know that systemic racism also increases racialized health disparities in these chronic conditions [16,18-20].
Despite these limitations, this is one of the first studies to examine the impact of systemic racism proxied by racial residential segregation on racial disparities in an infectious disease. Additionally, we examined multiple racialized groups. We found partial support for our three hypotheses, demonstrating that systemic racism plays an important role in infectious health disparities as well as chronic health disparities.

Table 1
Adjustments to merged COVID-19 and County Health Rankings Dataset
*This aspect makes our study generalizable to urban and suburban counties and is not flawed by missing data. Therefore, for the analysis of the impact of missing data on the sample, we will treat the 14,124,595 as our starting point (100% of the valid sample)

Table 2
Comparing CDC COVID-19 dataset to 2020 US Census by race and COVID-19 outcome
a Reduced after listwise deleting all whose survival is unknown on the COVID-19 outcome

Table 3
Descriptive statistics of variables included in analyses

Table 5
Sensitivity analyses