DOI: https://doi.org/10.21203/rs.3.rs-61369/v1
The COVID-19 outbreak has become a global pandemic. Spatial variation in the environmental, health, socioeconomic, and demographic risk factors of the COVID-19 death rate is not well understood. Global models and local linear models have been used to estimate the impact of COVID-19 risk factors, but they do not account for nonlinear relationships between the risk factors and the COVID-19 death rate at different geographical locations. We propose a local nonlinear nonparametric regression model, geographically weighted random forest (GW-RF), to estimate the nonlinear relationship between the COVID-19 death rate and 47 risk factors derived from the US Environmental Protection Agency, the National Centers for Environmental Information, the Centers for Disease Control and Prevention, and the US Census Bureau. The COVID-19 data were fitted with a global regression model, random forest (RF), and with the local GW-RF model. The adjusted R² is 0.69 for the RF and 0.78 for the proposed GW-RF. The GW-RF results show that six risk factors (going to work by walking, airborne benzene concentration, householder with a mortgage, unemployment, airborne PM2.5 concentration, and percent of the Black or African American population) are highly correlated with the spatial distribution of the COVID-19 death rate. These key factors, derived from the GW-RF, were mapped, which could inform efforts to control the spread of the COVID-19 pandemic.
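
The abstract does not spell out how GW-RF is calibrated. A minimal sketch of the general idea behind a geographically weighted model, assuming each location gets its own model trained only on its k nearest observations, could look like the following. The names `nearest_neighbours`, `fit_local_models`, and `fit_mean`, the neighbourhood size `k`, and the toy data are all hypothetical. `fit_mean` is only a stand-in for the random forest learner, which is not implemented here.

```python
import math
from statistics import mean


def nearest_neighbours(target, coords, k):
    """Return the indices of the k observations closest to `target`."""
    order = sorted(range(len(coords)), key=lambda i: math.dist(target, coords[i]))
    return order[:k]


def fit_local_models(coords, X, y, k, fit):
    """Fit one model per location, using only its k nearest observations.

    `fit(X_subset, y_subset)` must return a prediction function. In a
    GW-RF it would train a random forest on the subset (assumption).
    """
    models = []
    for target in coords:
        idx = nearest_neighbours(target, coords, k)
        models.append(fit([X[i] for i in idx], [y[i] for i in idx]))
    return models


def fit_mean(X_subset, y_subset):
    """Placeholder learner: always predicts the local mean of y."""
    local_mean = mean(y_subset)
    return lambda row: local_mean


# Toy data (hypothetical): two spatial clusters with different outcome levels.
coords = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]

models = fit_local_models(coords, X, y, k=3, fit=fit_mean)
local_predictions = [model(row) for model, row in zip(models, X)]
print(local_predictions)
```

Because every location is fitted on its own neighbourhood, the two clusters end up with different local models, which is the behaviour a single global model cannot reproduce.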
Table 1. Definitions of indicators and sources.

| Theme | Indicator | Indicator meaning | Source |
|---|---|---|---|
| Atmosphere | Airborne PM2.5 concentration | Annual average ambient concentrations of PM2.5 in micrograms per cubic meter | United States Environmental Protection Agency (https://www.epa.gov/) and Centers for Disease Control and Prevention (https://www.cdc.gov/) |
| | Airborne benzene concentration | Annual average concentration of benzene estimates in micrograms per cubic meter | |
| | Airborne formaldehyde concentration | Annual average air concentration of formaldehyde estimates in micrograms per cubic meter | |
| | Airborne acetaldehyde concentration | Annual average air concentration of acetaldehyde estimates in micrograms per cubic meter | |
| | Airborne carbon tetrachloride concentration | Annual average air concentration of carbon tetrachloride estimates in micrograms per cubic meter | |
| Climate | Air temperature | Average daily maximum air temperature ( ) | National Centers for Environmental Information (https://www.ncei.noaa.gov/) |
| | Precipitation | Average daily precipitation ( ) | |
| | Sunlight exposure | Annual average sunlight exposure measured by solar irradiance ( ) | Centers for Disease Control and Prevention (https://www.cdc.gov/) |
| | UV radiation exposure | Annual average daily dose of UV irradiance ( ) | |
| Land cover | Land cover with water | Percent of land covered by water | |
| | Land cover with forest | Percent of land covered by forest | |
| Disaster | Drought | Number of weeks of moderate drought or worse per year | |
| | Flood | Percentage of people within a FEMA-designated flood hazard area | |
| Health status | Disability | Percentage of population aged 5 years and over with a disability | |
| | Asthma | Percent of adults diagnosed with asthma | |
| | Obese | Percentage of adults aged 18 years and over who were obese | |
| | Overweight | Percentage of adults aged 18 years and over who were overweight | |
| | Cancer | Number of people with lung and bronchus cancer per 1000000 population | |
| Commuting to work | Go to work by private transportation | Percentage of workers 16 years and over who drove alone (car, truck, or van) | United States Census Bureau (https://www.census.gov/en.html) |
| | Go to work by public transportation | Percentage of workers 16 years and over who go to work by public transportation (excluding taxicab) | |
| | Go to work by walking | Percentage of workers 16 years and over who go to work by walking | |
| | Work at home | Percentage of workers 16 years and over who worked at home | |
| | Mean travel time to work | Mean travel time to work (minutes) of workers 16 years and over | |
| Socioeconomic | Health insurance | Percentage of population without health insurance | |
| | Householder with a mortgage | Percentage of households with a mortgage | |
| | Poverty | Percentage of population whose income is below the poverty level | |
| | Service occupations | Percentage of employed population 16 years and over with service occupations | |
| | Unemployment | Percentage of population 16 years and over unemployed | |
| | Hospital | Number of hospitals | Centers for Disease Control and Prevention (https://www.cdc.gov/) |
| | Hospital beds | Number of hospital beds per 10000 population | |
| | People living in group quarters | Percentage of population living in group quarters | United States Census Bureau (https://www.census.gov/en.html) |
| | People living near a park | Percentage of population living within a half mile of a park | |
| | Householder with no internet access | Percentage of households with no internet access | |
| | Median household income | | |
| | Mean household retirement income | | |
| | Mean household cash public assistance income | | |
| | Mean household Supplemental Security Income | | |
| Demographic | Percent of males | | |
| | Median age | | |
| | Percent of people under 18 years | | |
| | Percent of people 65 years and over | | |
| | Percent of the white race | | |
| | Percent of the Black or African American | | |
| | Percent of American Indian and Alaska Native | | |
| | Percent of Asian | | |
| | Percent of Native Hawaiian and Other Pacific Islander | | |
| | Percent of Hispanic or Latino | | |

Table 2. Statistics of the local R² of the GW-RF in modelling the COVID-19 death rate: the average local R² and the percentage of counties in each of five local R² ranges (≤0.2, (0.2, 0.4], (0.4, 0.6], (0.6, 0.8], >0.8).

| Local R² | GW-RF |
|---|---|
| Average value | 0.59 |
| ≤0.2 | 1.1% |
| (0.2, 0.4] | 9.5% |
| (0.4, 0.6] | 38.9% |
| (0.6, 0.8] | 44.8% |
| >0.8 | 5.7% |

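
The summary in Table 2 can be reproduced from a list of per-county local R² values by averaging them and counting how many fall into each half-open range. The sketch below assumes such a list is available. The function names `r2_range` and `summarise_local_r2` and the ten sample values are hypothetical and are not the study's data.

```python
from collections import Counter

# Upper bounds of the half-open ranges used in Table 2, in ascending order.
BINS = [(0.2, "<=0.2"), (0.4, "(0.2, 0.4]"), (0.6, "(0.4, 0.6]"), (0.8, "(0.6, 0.8]")]
LABELS = [label for _, label in BINS] + [">0.8"]


def r2_range(value):
    """Return the label of the Table 2 range that contains `value`."""
    for upper, label in BINS:
        if value <= upper:
            return label
    return ">0.8"


def summarise_local_r2(local_r2):
    """Return (average local R2, percentage of counties per range)."""
    counts = Counter(r2_range(v) for v in local_r2)
    n = len(local_r2)
    shares = {label: 100 * counts[label] / n for label in LABELS}
    return sum(local_r2) / n, shares


# Hypothetical local R2 values for ten counties.
sample = [0.15, 0.35, 0.55, 0.65, 0.75, 0.85, 0.45, 0.7, 0.6, 0.8]
average, shares = summarise_local_r2(sample)
print(round(average, 3), shares)
```

The boundary values 0.6 and 0.8 in the sample land in the lower range of each pair, matching the closed upper bounds in the table's interval notation.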
Table 3. Proportion of counties by local primary risk factor (the risk factor with the highest local variable importance) for the COVID-19 death rate at county level in the GW-RF.

| Local primary risk factor | Proportion of counties |
|---|---|
| Go to work by walking | 35% |
| Airborne benzene concentration | 25% |
| Householder with a mortgage | 13% |
| Unemployment | 12% |
| Other risk factors | 16% |
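
Following the definition in the Table 3 caption, the local primary risk factor of a county is the factor with the highest local variable importance, and the table reports the share of counties per factor. A minimal sketch of that tally is given below, assuming the local importances are stored as one dictionary per county. The county labels, importance values, and function names are hypothetical.

```python
from collections import Counter


def primary_factor(importance):
    """Return the risk factor with the highest local variable importance."""
    return max(importance, key=importance.get)


def primary_factor_shares(local_importance):
    """Return the proportion of counties for each local primary risk factor."""
    counts = Counter(primary_factor(imp) for imp in local_importance.values())
    total = len(local_importance)
    return {factor: count / total for factor, count in counts.most_common()}


# Hypothetical local variable importances for four counties.
local_importance = {
    "county_a": {"walking": 0.40, "benzene": 0.30, "unemployment": 0.10},
    "county_b": {"walking": 0.50, "benzene": 0.20, "unemployment": 0.20},
    "county_c": {"walking": 0.10, "benzene": 0.60, "unemployment": 0.20},
    "county_d": {"walking": 0.20, "benzene": 0.10, "unemployment": 0.70},
}

shares = primary_factor_shares(local_importance)
print(shares)
```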