Time-varying associations between COVID-19 case incidence and community-level sociodemographic, occupational, environmental, and mobility risk factors in Massachusetts

Background: Associations between community-level risk factors and COVID-19 incidence are used to identify vulnerable subpopulations and target interventions, but the variability of these associations over time remains largely unknown. We evaluated variability in the associations between community-level predictors and COVID-19 case incidence in 351 cities and towns in Massachusetts from March to October 2020. Methods: Using publicly available sociodemographic, occupational, environmental, and mobility datasets, we developed mixed-effect, adjusted Poisson regression models to depict associations between these variables and town-level COVID-19 case incidence data across five distinct time periods. We examined town-level demographic variables, including z-scores of percent Black, Latinx, over 80 years and undergraduate students, as well as factors related to occupation, housing density, economic vulnerability, air pollution (PM2.5), and institutional facilities. Results: Associations between key predictor variables and town-level incidence varied across the five time periods. We observed reductions over time in the association with percentage Black residents (IRR=1.12 CI=(1.12–1.13) in spring, IRR=1.01 CI=(1.00–1.01) in fall). The association with number of long-term care facility beds per capita also decreased over time (IRR=1.28 CI=(1.26–1.31) in spring, IRR=1.07 CI=(1.05–1.09)in fall). Controlling for other factors, towns with higher percentages of essential workers experienced elevated incidence of COVID-19 throughout the pandemic (e.g., IRR=1.30 CI=(1.27–1.33) in spring, IRR=1.20, CI=(1.17–1.22) in fall). Towns with higher percentages of Latinx residents also had sustained elevated incidence over time (e.g., IRR=1.19 CI=(1.18–1.21) in spring, IRR=1.14 CI=(1.13–1.15) in fall). Conclusions: Town-level COVID-19 risk factors vary with time. In Massachusetts, racial (but not ethnic) disparities in COVID-19 incidence have decreased over time, perhaps indicating greater success in risk mitigation in selected communities. Our approach can be used to evaluate effectiveness of public health interventions and target specific mitigation efforts on the community level.

Conclusions: Town-level COVID-19 risk factors vary with time. In Massachusetts, racial (but not ethnic) disparities in COVID- 19 incidence have decreased over time, perhaps indicating greater success in risk mitigation in selected communities. Our approach can be used to evaluate effectiveness of public health interventions and target speci c mitigation efforts on the community level.

Background
As of January 2021, the United States had the highest absolute number of coronavirus disease 2019 (COVID- 19) cases and deaths in the world. 1 Within the US, disease incidence has varied substantially among states with different policy interventions and adherence to public health guidance, [2][3][4] and there is also signi cant variability within states. 5 For example in Massachusetts, during the two-week period from January 10-23, 2021, COVID-19 average daily incidence exceeded 100 con rmed cases per 100,000 persons in multiple urban communities (including Chelsea, Lawrence and New Bedford), with a low of zero in a number of more rural communities. 6 Early in the pandemic, multiple community-level factors were associated with higher COVID-19 incidence rates, with disproportionate burdens among communities with more racial and ethnic diversity and workers in essential services. 7 As the pandemic evolved over time, factors associated with elevated case incidence are likely to have evolved as well, given changes in work patterns, behaviors, and state and local policies. To our knowledge, no studies to date have investigated the time-dependent association of COVID-19 incidence rate at a town level with key predictors.
In this study, we present a novel set of models analyzing associations between publicly available town-level data and COVID-19 incidence, with the goal of evaluating changes in key town-level predictors across Massachusetts over time. We ran parallel regression models with the same set of predictors over ve different time periods from the rst recorded case in Massachusetts in March 2020 to October 2020 to assess the temporal variability of the relative magnitude and signi cance of these predictors on COVID-19 incidence. Characterization of the time-variant nature of these predictors may assist in future efforts to target interventions to speci c subpopulations and assess the effectiveness of mitigation strategies on the community level over time.

Page 3/11
Methods Time periods: We used publicly available case incidence data for 351 cities and towns published by the Massachusetts Department of Public Health (MA DPH) from April 14, 2020 through October 29, 2020. 8 Case data were centrally reported to MA DPH by all laboratories across the state that conducted COVID-19 testing, and cumulative case counts and rates published weekly. We strati ed that case data across 5 time periods selected to re ect distinct waves of case incidence in the state, starting with the spring " rst wave" (March 2 -April 14, 2020, and April 15 -June 3), the summer nadir (June 4 -July 15, 2020 and July 16 -September 2, 2020), and the early months of the "second wave" in the fall (September 3 -October 29, 2020). The speci c end date was chosen based on COVID-19 case incidence data availability at the time of study.
Data sources: Table 1 lists the sociodemographic, occupational, environmental, and mobility datasets at the city/town level which were integrated with data on COVID-19 incidence during the ve time periods. Sociodemographic, occupational and economic data were extracted from the 2014-2018 American Community Survey (ACS) 5-year estimates at the census tract level. Data were scaled to town level by downloading ACS data in absolute numbers, aggregating to town level, and proportions calculated with ACS town population estimates. 9 Some rural census tracts contained several smaller towns, thus these had identical ACS percentage values. Estimates for percentage of essential workers were modi ed from ACS data based on an integrated dataset developed by the American Civil Liberties Union, which restricted ACS data on service workers to those job types considered "essential" during the pandemic including healthcare practitioners, transportation occupation, food preparation, etc. 10 Data on town-level imprisoned population were obtained through MassGIS 11 and longterm care beds by town were obtained from MA DPH. 8,12,13 The percentage of people commuting to work was estimated using the SafeGraph Social Distancing Metrics dataset derived from anonymized cell phone mobility data, where a commuter is de ned as someone who spends 8 hours a day M-F in a single location away from their home block group. 14 Annual PM 2.5 concentrations were obtained from Kloog et al. at 1 km grid resolution, averaged by town for the year of 2015, and linked to re ect a general measure of air pollution. 15 Population density, as a proxy for local transmission risk, was calculated per town using 100 meter cell size population grids derived from the 2010 Census counts available at the census block level (assuming equal spatial distribution of people within census blocks). For each town we averaged the cell values depicting population totals in each cell at different Euclidean distance radii (5,20,50, and 100 km) to capture the effects of urbanicity and population density at different scales. 16  as an offset to account for differences in population per town. To minimize multi-collinearity while maintaining a good t, we used a backward selection process of covariates where we stepwise excluded covariates with a variance in ation factor higher than 2.5 in the regression of the rst time period. 18 All numerical predictors were normalized to zero mean and standard deviation of 1. We included the county of each town as a random effect to control for spatial autocorrelation of residuals (351 towns divided over 14 counties). For every period except the rst period ending on April 14, we included an ordinal variable of the COVID-19 case count per capita of the previous period, divided into quintiles, as a random effect to adjust for between-period temporal autocorrelation. To assess collinearity structure of the predictor variables, we calculated Pearson correlations between all bivariate predictor combinations and presented them in a correlation matrix. Mixed-effect models were executed in R 3.6.1 19 using the glmer function from the lme4 package. 20

Results
The backward predictor selection process resulted in a set of eight xed-effect predictor variables that had a good t in all 5 phases of the pandemic between the beginning of March and the end of October 2020. Excluded variables either caused high levels of multi-collinearity or had no signi cant association with the response variable. The correlation matrix in Fig. 1 shows a cluster of population density variables with public transportation and PM 2.5 in the upper left corner and a cluster of housing crowdedness with percent of the population working in essential services, under the poverty line, or with disabilities.
Both clusters show positive correlations with % Black population and Hispanic/Latino (henceforth referred to as Latinx) population. Point estimates and associated 95% con dence intervals are presented for all eight predictors in the model across all time periods (Table 2).  Of the nal selection of predictors, almost all showed substantial change in association with COVID-19 cases over the study period (Fig. 2). Whereas an increase of one standard deviation in the percentage of town residents identifying as Black was associated with an increase of more than 10% in COVID-19 cases before April 14, the association steadily decreased and became statistically non-signi cant by September-October. At the start of the pandemic, we observed a nearly 20% increase in COVID-19 cases by town associated with one standard deviation increase in the percentage of town residents identifying as Latinx, but the strength of this positive association varied over time, with a decreasing association from April to September and an uptick in the association in the last phase of our model (September 3rd to October 29th ).
We observed a decreasing trend in the association between percent of the town population older than 80 years and the number of long term care facility beds per town with COVID-19 incidence, indicating that the presence of older residents and institutional care facilities in a town had a stronger association with incidence in the early, but not the later, periods of the study. The percent of essential workers by town consistently showed a signi cant positive association with COVID-19 incidence throughout all time periods.
Population density, measured as the total number of people living within 20 km from the boundary of each town, showed notable changes in association with COVID-19 case incidence, with high positive associations in the rst and fourth time periods, and non-signi cant associations during the other time periods. The undergraduate student population, measured as the percentage of the town population enrolled in undergraduate education, was associated with a small positive effect on COVID-19 incidence early in the pandemic and a larger association after September 3rd, when several universities throughout the state resumed with in-person or hybrid classes after shutting down for in-person learning in March/April 2020.

Discussion
Our ndings illustrate the time-varying nature of sociodemographic and economic predictors of COVID-19 case incidence at the community level in Massachusetts during an eight-month period. These observations suggest that xed assumptions regarding community-level COVID-19 vulnerability, such as increased risk among Black communities or continued elevated relative risk among the elderly, may not accurately represent the pandemic at any given point in time, and that these associations should be continually reassessed though time for relevance alongside shifting mitigation efforts, policies and individual behavior. While other studies have identi ed racial/ethnic disparities in COVID-19 incidence in Massachusetts and the extent to which these patterns are explained by social factors, 7 our study provides novel insight about how both patterns and predictors change over time. Likewise, associations that remain signi cant over time with consistent coe cients in these adjusted models, such as workers in essential services, may suggest that existing approaches to reduce risk within these subgroups are less effective, perhaps due to structural challenges in reducing certain exposures.
We note that the community-level predictor variables used here serve as proxies for underlying factors that in uence individual SARS-CoV-2 exposure risk. This distinction is important in our analysis, as it remains possible that analyses using individual-level predictors would lead to different ndings. However, our analysis provides insight into sociodemographic patterns of COVID-19 and subpopulations who may be at elevated risk, informing community-scale public health interventions, and vaccination strategies by location, as well as a useful and adaptable structure to assess risk alongside public data.
The consistently elevated risk observed in communities with increased Latinx populations (in models adjusted for essential workers and other sociodemographic variables) may underscore challenges in reaching these communities with successful interventions, or barriers to reduced exposure in these communities. The sustained elevated risk of Latinx populations throughout the rst eight months of the pandemic is consistent with recent studies. 7,21 In our models, Latinx population was positively correlated with greater housing density, suggesting that this ethnicity variable may serve in part as a proxy for crowded housing (> 1 person/room) in our models. Within-household transmission is an established risk factor for COVID-19 transmission, 22,23 and it is possible that our ndings here re ect established challenges in reducing transmission within housing environments where residents are unable to meaningfully distance from an infectious individual.
We did not observe a correlation between housing density and Black population in a town, or between this race variable and other sociodemographic covariates in our model. This nding suggests that other factors beyond the scope of our analysis may be responsible for the elevated COVID-19 risks faced by towns with higher percentage of Black residents, especially early in the pandemic. Our ndings parallel core conclusions of Figueroa et al., who evaluated cross-sectional COVID-19 incidence alongside demographic data in Massachusetts from March-May 2020. 7 The authors noted racial disparity in disease incidence in the early wave after adjustment for essential workers, immigration, and household size, while the association between COVID-19 cases and Latinx population was attenuated in models adjusted for these factors. Together, our studies support the hypotheses that systemic racism and inequities not otherwise captured in core demographic datasets play a role in driving COVID-19 racial inequities and that distinct analyses focused on systemic inequities are needed to fully understand the speci c risks faced by Black individuals and communities.
Reduced risk of COVID-19 incidence in communities with higher percentages of Black residents, adjusting for other factors, may suggest that the clear racial disparity observed in early months of the pandemic has diminished over time in Massachusetts. 5,24−26 Our nding parallels other observations as to the reduced racial disparity in COVID-19 cases over time. 27 Likewise, substantial reductions in the association between long-term care beds and percentage of town population over 80 years and COVID-19 incidence may re ect success in interventions to protect the elderly, especially those living in long-term or nursing care, after the initial, devastating impacts on this population early in the pandemic. 28,29 It remains possible that factors not included in our analysis, notably biases associated with testing availability or other unstudied correlates, are responsible for the reductions in risk we observe here, especially by race. Analysis of more severe outcomes, including hospitalization and deaths, would help inform these questions, although requiring less-nuanced temporal resolution, and should be prioritized for future work.
The continued positive association between essential workers and COVID-19 incidence may re ect the continued inability of this workforce to adequately protect themselves from infection, despite workplace controls and personal mitigation behaviors, such as masking and maintaining social distance. Essential workers remained at the highest risk of all subgroups studied in our models throughout the pandemic, highlighting key challenges in protecting these populations, even many months into the pandemic. Interestingly, however, a covariate representing people spending more than 3 hours in a location other than their home during o ce hours showed a null association with COVID-19 cases. The non-signi cant effect of worker mobility on case incidence may re ect limitations in the underlying data. SafeGraph data includes only 10% of cellphones in the US 30 , although the data are highly correlated with true Census population. 31 However, it could also indicate that changes in worker mobility (e.g. "return to work" efforts) were not associated with increased cases when these workers were not part of the essential workforce. This would reinforce that communities with more non-essential workers face distinctly lower risk than those with more essential workers, even with resumption of economic activities, highlighting inequities in exposure pro les on the job.
The signi cant association between the percentage of town residents without health insurance and case incidence in the spring and summer periods may suggest growing case incidence among immigrants during those times. Due to a state health insurance mandate, the number of non-insured individuals in Massachusetts is low (2.8% of the total state population), but some municipalities have uninsured rates up to 25%, such as Chelsea, Everett and Lawrence. These communities have higher proportions of younger adults, non-citizens, and those who are less educated than the insured population. 32 Targeted interventions that focus on communities with elevated percentages of uninsured persons are important in fully understanding disease dynamics in the state.
As noted, our study is limited by the use of town-level covariates, which reduce our ability to draw causal inferences on the individual level, though our insights remain relevant for targeted public health strategies. The use of static predictors from the ACS and other databases allowed us to evaluate and interpret changes in coe cient magnitude and signi cance over time, but gave us limited ability to capture time-dynamic exposure factors (with the exception of SafeGraph data). While most of these predictors are expected to be fairly stable during the study period, notably demographic data, it is possible that real-time changes in these covariates may have altered our ndings. It is possible that disparities in testing availability, as noted earlier, contribute to observed trends in case incidence if communities with greater racial diversity had reduced access to case identi cation, and this is an important area for additional work.

Conclusions
Our analyses demonstrate that the relevance and magnitude of risk factors on the community-level for COVID-19 are alterable, suggesting in part the relative effectiveness of intervention and mitigation efforts by population subgroups (or, conversely, factors that contribute to elevated risk varying over time). Our study highlights the need for local jurisdictions to use up-to-date data on vulnerable and high-risk populations to direct COVID-19 interventions, including vaccinations, rather than data from early in the pandemic. Our models were derived entirely from publicly-available data and could be rapidly re t to newer case incidence data. Community-level analyses can help characterize social inequities embedded in the pandemic and track the evolution of these inequities with time, highlighting successes as well as disproportionate burden experienced by vulnerable populations.