Ours is the first study to examine the association of multiple population-level factors with county-level variation in the initial incidence and case fatality risk of COVID-19. We focused primarily on initial community spread in order to identify populations with higher susceptibility to COVID-19 infection and fatality. We found significant variation in the incidence (median: 193.4 per 100,000 population; interquartile range (IQR): 94.2–397.5) as well as the case fatality risk (CFR) of COVID-19 (median: 3.6%; IQR: 1.4–7.3) for the initial 4-week period.
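As an illustration of how the two summary measures above are defined, the following minimal sketch computes incidence per 100,000 population and CFR for a handful of counties and summarizes each with a median and IQR. The county records are hypothetical placeholders, not study data, and the function names and the choice of the "inclusive" quartile method are assumptions made for this example rather than the method used in the analysis.

```python
from statistics import median, quantiles


def incidence_per_100k(cases, population):
    """Cumulative cases per 100,000 residents."""
    return cases * 100_000 / population


def cfr_percent(deaths, cases):
    """Case fatality risk: deaths as a percentage of confirmed cases."""
    return deaths * 100 / cases


def median_iqr(values):
    """Return (median, first quartile, third quartile) of a list of numbers."""
    # The quartile convention is an assumption; statistical packages differ.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Hypothetical county records, for illustration only.
counties = [
    {"name": "County A", "cases": 150, "deaths": 3, "population": 120_000},
    {"name": "County B", "cases": 480, "deaths": 30, "population": 95_000},
    {"name": "County C", "cases": 1_200, "deaths": 40, "population": 640_000},
    {"name": "County D", "cases": 210, "deaths": 9, "population": 55_000},
    {"name": "County E", "cases": 900, "deaths": 12, "population": 310_000},
]

incidences = [incidence_per_100k(c["cases"], c["population"]) for c in counties]
cfrs = [cfr_percent(c["deaths"], c["cases"]) for c in counties]

print("Incidence per 100,000 (median, Q1, Q3):", median_iqr(incidences))
print("CFR in percent (median, Q1, Q3):", median_iqr(cfrs))
```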
We also identified various independent predictors of initial incidence of COVID-19. The positive association with higher median age, male sex, and chronic medical conditions (obesity and COPD) is in accordance with the various individual-level risk factors described by numerous clinical studies [6–10]. The elderly male populations with higher chronic disease burden are likely to have high susceptibility for COVID-19.
Interestingly, female sex was negatively associated with higher incidence. Biological susceptibility, occupational roles, and more responsible behavior in following public health guidelines might explain this. Excessive drinking was also found to be a strong protective factor, which could be explained by lower mobility and less social interaction in this population. On the other hand, population density was positively associated with higher incidence, supporting the role of social mobility in driving the spread of infection. All of these factors underscore the utility of social distancing in slowing the transmission of COVID-19. Additionally, higher education was negatively associated, and percent uninsured population was positively associated, with the highest quartile of incidence. This highlights the importance of regular academic education as well as health education (with percent uninsured population as a proxy) in slowing the spread of the virus.
Furthermore, we identified independent predictors of the case fatality risk of COVID-19 during initial community spread. Higher age and female sex were the strongest predictors associated with higher CFR, as shown by other individual-level clinical studies [14–17]. We also found a significant positive association of Asian race with higher CFR, whereas Hispanic ethnicity was negatively associated. Non-Hispanic black race was not significantly associated with higher CFR. Various other studies have found a non-significant association of black race with CFR [27–29], while some have shown significantly higher mortality. Further research is needed in this area.
Unexpectedly, we did not find an association of higher CFR with the prevalence of any of the included chronic medical conditions except adult obesity. Adult obesity was negatively associated with the highest quartile of CFR (aOR: 0.95; 95% CI: 0.90, 0.99), supporting the ‘obesity paradox’. The obesity paradox has been described as an association of obesity with decreased mortality in patients with acute respiratory distress syndrome (ARDS), reported previously in various studies [31–33]. However, whether such a phenomenon also holds true for ARDS following COVID-19 infection is not yet clear [32, 34].
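To make the reading of an adjusted odds ratio such as 0.95 concrete, the sketch below shows the standard arithmetic that links a logistic-regression coefficient to an odds ratio with a Wald 95% confidence interval, and how a per-unit odds ratio compounds over several units. The text does not state the unit of the obesity covariate, so the "one unit" here (for example, one percentage point of prevalence) is an assumption, and the function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile used for a two-sided 95% Wald interval


def odds_ratio_with_ci(beta, se, z=Z_95):
    """Convert a log-odds coefficient and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def odds_ratio_for_change(or_per_unit, units):
    """Odds ratio implied by a change of `units` in the covariate.

    Odds ratios multiply on the odds scale, so the per-unit value is
    raised to the power of the change.
    """
    return or_per_unit ** units


# Assumed example: if 0.95 applies per one-unit increase, a five-unit
# increase corresponds to 0.95 ** 5 on the odds scale.
print(round(odds_ratio_for_change(0.95, 5), 3))
```

Under this reading, an aOR just below 1 per unit can still correspond to a noticeably lower odds of falling in the highest CFR quartile across a wider difference in prevalence.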
Moreover, we found that fine particulate matter (PM 2.5) was not associated with CFR. This is consistent with another nationwide cross-sectional study on the effect of air pollution, which showed a non-significant effect of PM 2.5 and ozone, but a significant effect of NO2, on COVID-19 death outcomes. We also did not find an independent association of smoking with CFR. However, different meta-analyses have identified significant associations of smoking with severe complications as well as higher mortality from COVID-19 [35, 36].
Surprisingly, availability of healthcare resources, defined by the number of Intensive Care Unit beds and the number of airborne infection isolation rooms, was weakly but positively associated with the highest quartile of case fatality risk, whereas the uninsured population was negatively associated with it. The lower disease burden and the rapidity of spread during the initial weeks of the epidemic in each county might explain this seemingly contradictory effect of healthcare resource availability on CFR variation. A study in China showed that the rapid escalation in the number of infections around the epicenter of the outbreak (Wuhan city) resulted in an insufficiency of health-care resources, thereby negatively affecting mortality in Hubei province, but not in other provinces of China.
Our study included an assessment of a comprehensive range of factors with potential predictive value for the spread and fatality of COVID-19. In contrast to other population-level studies on COVID-19, we were able to control for major confounding by epidemic timing as well as stage of the epidemic by identifying a common starting point for each county (i.e., the reporting of the first 100 cases). We were also able to control for the unmeasured effects of various factors, such as diverse weather, varied social distancing norms, and different timing of stay-at-home orders, by including a group effect for each state.
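The alignment on a common starting point can be sketched as follows: for each county, find the first day on which cumulative cases reach 100, then read off the counts at the end of the following 4-week window. This is a minimal illustration under stated assumptions. The function names are hypothetical, the toy series are made up, and whether the study counted cases from the threshold day onward or cumulatively is not specified in the text, so the cumulative reading used here is an assumption.

```python
from typing import Optional, Sequence, Tuple

THRESHOLD = 100    # cumulative cases marking the common starting point (from the text)
WINDOW_DAYS = 28   # the initial 4-week period (from the text)


def first_day_at_threshold(cumulative: Sequence[int],
                           threshold: int = THRESHOLD) -> Optional[int]:
    """Index of the first day on which cumulative cases reach the threshold."""
    for day, total in enumerate(cumulative):
        if total >= threshold:
            return day
    return None  # the county never reached the threshold


def window_totals(cum_cases: Sequence[int],
                  cum_deaths: Sequence[int],
                  window_days: int = WINDOW_DAYS) -> Optional[Tuple[int, int]]:
    """Cumulative (cases, deaths) at the end of the window after the start day.

    Returns None if the threshold was never reached or if the series is too
    short to cover the full window.
    """
    start = first_day_at_threshold(cum_cases)
    if start is None:
        return None
    end = start + window_days
    if end >= len(cum_cases) or end >= len(cum_deaths):
        return None
    return cum_cases[end], cum_deaths[end]


# Toy daily series for one hypothetical county (not study data).
toy_cases = [10 * day for day in range(60)]
toy_deaths = [day for day in range(60)]
print(window_totals(toy_cases, toy_deaths))
```

Anchoring every county to its own threshold day means counties are compared at the same epidemic stage rather than on the same calendar dates.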
However, we acknowledge that our study is limited in several key areas. Firstly, the data on confirmed cases and deaths of COVID-19 from CSSE at Johns Hopkins University are derived from publicly available data from multiple sources, such as the World Health Organization, the U.S. Centers for Disease Control and Prevention, state and national government health departments, local media reports, etc. [1, 2]. Because different organizations use different COVID-19 case definitions, there could be artificial variability in the data itself. Secondly, the case fatality risk estimate used does not provide the true rate, as there is a substantial lag of reported deaths among reported cases (hospitalized patients who die typically do so after 2–3 weeks). However, this limitation is shared by all population-level studies. Thirdly, because of the limited sample size, we were not able to control for all plausible confounders in our modeling. Fourthly, we did not look at some other potential factors, as doing so was beyond the scope of this study. Specifically, we could not examine the effect of important chronic medical conditions identified by various other studies, such as hypertension, chronic heart disease, and cancer, or of other air pollutants such as NO2 and ozone [22, 42, 43]. Fifthly, the data on a few chronic medical conditions (asthma, COPD, chronic kidney disease) used in this study were obtained from CMS. These are Medicare beneficiary data and hence are not generalizable to the general population. Caution should be taken when interpreting the findings with respect to these three factors.
Since the beginning of the novel coronavirus pandemic, there have been numerous efforts to build better prediction models. However, the predictive performance of these models has fallen short of expectations. The predictors identified by our study may help build better models. Additionally, these findings may help identify the most susceptible and high-risk populations and target public health interventions to focus areas. Lastly, our study also highlights the importance of social distancing as well as health education.
To summarize, we identified various county-level independent predictors of initial incidence as well as case fatality risk of COVID-19. The findings can help build better future prediction models. The results also support targeted public health actions by identifying susceptible and high-risk populations as well as counties.