Measuring Economic Activity
Comparison of weekly PM2.5 levels (top) and local unemployment levels (bottom) in Fresno, California in March 2020
Because GDP is reported only quarterly and we wanted more frequent data, we sought a proxy for economic activity that could be compared against PM2.5 levels. After using a K-Nearest Neighbors imputer to estimate unemployment levels by county, we plotted weekly mean PM2.5 values against weekly unemployment values. In March, Fresno County saw a 45% increase in unemployment and a 74% decrease in PM2.5 (Fig. 3), indicating an inverse relationship between unemployment and PM2.5 for that month.
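The imputation step can be illustrated with a minimal sketch. It assumes a simple form of K-Nearest Neighbors imputation in which a missing value is replaced by the mean of that column over the k closest rows, with distance measured on the columns both rows have observed. The function name `knn_impute`, the choice of features (week index and unemployment rate), and the numbers are hypothetical and are not taken from the study.

```python
import math

def knn_impute(rows, k=2):
    """Fill None cells with the mean of that column over the k nearest rows."""
    n_cols = len(rows[0])

    def distance(a, b):
        # Euclidean distance over columns observed in both rows, scaled up
        # to compensate for the columns that had to be skipped.
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum(shared) * n_cols / len(shared))

    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: every other row that has column j observed.
            donors = sorted(
                ((distance(row, other), other[j])
                 for m, other in enumerate(rows)
                 if m != i and other[j] is not None),
                key=lambda pair: pair[0],
            )
            nearest = [v for _, v in donors[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled

# Hypothetical rows: [week index, unemployment rate in percent].
weekly = [[1.0, 4.0], [2.0, 4.2], [3.0, None], [4.0, 5.0], [5.0, 6.0]]
imputed = knn_impute(weekly, k=2)
print(imputed[2])
```

Here the missing week-3 value is filled from weeks 2 and 4, its two nearest neighbors by week index. The input list is left unchanged.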
Table 1
Correlation coefficients showing the statistical relationship between weekly PM2.5 and local unemployment levels
County | Correlation Coefficient |
Contra Costa | -0.4435 |
Napa | -0.60851 |
Sacramento | -0.66302 |
Sutter | -0.68531 |
The relationship can also be quantified by computing the correlation coefficient between weekly pollution and unemployment data from February 16, 2020, to May 31, 2020. As shown in Table 1, every coefficient is negative, indicating an inverse relationship between the two variables. Napa, Sacramento, and Sutter (coefficients between -0.61 and -0.69) exhibit a stronger inverse relationship between unemployment and pollution than Contra Costa (-0.44).
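As an illustration of how such a coefficient is obtained, the sketch below computes a Pearson correlation between two weekly series. The text does not name the type of coefficient used, so Pearson's r is an assumption, and the sample values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-deviations and sums of squared deviations.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical weekly values: PM2.5 falls while unemployment rises.
pm25 = [12.0, 11.0, 9.0, 6.0, 5.0]
unemployment = [4.0, 4.5, 6.0, 9.0, 10.0]
r = pearson_r(pm25, unemployment)
print(round(r, 4))
```

A value near -1 indicates a strong inverse linear relationship, a value near 0 indicates little linear relationship, and a value near +1 indicates a strong direct relationship.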
Evaluating the Economic Viability of Certain Industries
Total employed persons for treatment and control industries. Using an algorithm to predict an industry’s total employment, graphs for total employed persons for a. all control, b. automotive, c. construction, d. mining, e. oil, f. paper, and g. wood industries were created
After affirming that unemployment could be a measure of economic activity on a county level, we evaluated the economic health of several state industries using employment data. We developed a model to predict total employed persons for specific industries weekly using monthly employment data provided by the California Employment Department [14], producing a richer data set. These values were then plotted in Fig. 4, comparing the employment values of treatment industries, or industries that could be affected by the rollbacks, to those of control industries, or industries that could be unaffected by rollbacks.
Data were then split into two categories: pre-shutdown and pre-rollback (weekly employment data through March 31, 2020) and post-shutdown and post-rollback (weekly employment data from April 1, 2020, onward). California issued a statewide stay-at-home order on March 19, 2020, and the EPA rolled out new guidelines on March 26, 2020.
Two two-sided paired t-tests, one using pre-shutdown and pre-rollback data and the other using post-shutdown and post-rollback data, were conducted comparing the employment values of each treatment industry with those of the control industries. For both tests, the null hypothesis stated that total employed persons for treatment and control industries were not statistically different. The alternative hypothesis stated the opposite: the employment values for treatment and control industries were statistically different. Both paired t-tests produced statistically significant values (p < 5 × 10⁻¹¹) for each of the 6 treatment industries, suggesting that employment in the treatment industries differs from that in the control industries both before and after the state shutdown and EPA rollback.
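The mechanics of a two-sided paired t-test can be sketched as follows. This is a minimal standard-library illustration, not the code used in the study: the helper names are hypothetical, the weekly employment figures are invented, and the p-value is approximated by numerically integrating the Student t density rather than by calling a statistics package.

```python
import math
from statistics import mean, stdev

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson integration of the Student t density."""
    const = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    const /= math.sqrt(df * math.pi)

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    central = 2 * area * h / 3  # probability mass between -|t| and |t|
    return max(0.0, 1 - central)

def paired_t_test(a, b):
    """Return (t, df, p) for a two-sided paired t-test on matched samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1, t_two_sided_p(t, n - 1)

# Hypothetical weekly employment (thousands) for matched weeks.
treatment = [10, 12, 14, 16, 18]
control = [9, 10, 13, 13, 17]
t_stat, df, p_value = paired_t_test(treatment, control)
print(t_stat, df, p_value)
```

Pairing by week means that only the week-by-week differences enter the test, so trends shared by both series cancel out. The numerical integration loses precision for extremely small p-values, where a dedicated statistics library is the better choice.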
A two-sided unpaired t-test was conducted comparing pre-shutdown and pre-rollback employment data to post-shutdown and post-rollback data for the control industries and each of the 6 treatment industries. The null hypothesis stated that there was no difference in an industry's employed persons after shutdown and rollbacks, while the alternative hypothesis stated that there was a difference. As shown in Table 2, most industries, including the control industries, showed a statistically significant change (p < 0.05). Construction and Wood Manufacturing did not (p = 0.319 and p = 0.4355, respectively).
Table 2
P-values of employed persons before and after shutdown and rollbacks (two-sided unpaired t-test)
Industry | P-value |
Control | 0.0001803 |
Automotive | 0.0001606 |
Construction | 0.319 |
Mining | 0.003052 |
Oil | 0.009941 |
Paper | 0.0001068 |
Wood | 0.4355 |
Evaluating Air Pollution in Treatment-Control Pairs
We collected daily PM2.5 data [16] from the oil refinery treatment-control pair (Contra Costa and Sacramento) and the manufacturing treatment-control pair (Napa and Sutter). These data were averaged weekly and then split into pre-shutdown and pre-rollback data (spring 2019 or early 2020, depending on the test) and post-shutdown and post-rollback data (from April 1, 2020, onward).
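The weekly averaging step can be sketched as below. The text does not say how weeks were delimited, so grouping by ISO calendar week (Monday to Sunday) is an assumption, and the readings are hypothetical.

```python
from collections import defaultdict
from datetime import date

def weekly_means(daily):
    """Average (date, value) readings by ISO calendar week.

    Returns a dict keyed by (ISO year, ISO week number).
    """
    buckets = defaultdict(list)
    for day, value in daily:
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        buckets[(iso[0], iso[1])].append(value)
    return {week: sum(vals) / len(vals)
            for week, vals in sorted(buckets.items())}

# Hypothetical daily PM2.5 readings; missing days are simply absent.
readings = [
    (date(2020, 3, 30), 10.0),
    (date(2020, 3, 31), 12.0),
    (date(2020, 4, 5), 8.0),
    (date(2020, 4, 6), 6.0),
    (date(2020, 4, 7), 8.0),
]
weekly = weekly_means(readings)
print(weekly)
```

Note that a week can straddle a cutoff date such as April 1, 2020, so the rule for assigning such a week to the pre- or post-period needs to be fixed before splitting.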
Table 3
P-values of pre-shutdown and pre-rollback PM2.5 values for treatment-control counties (two-sided paired t-test)
Frequency | Contra Costa - Sacramento | Napa - Sutter |
Weekly | 0.003271 | 0.7945 |
Daily | 2.60E-07 | 0.621 |
PM2.5 values before March 31, 2019, for a treatment and control county were used as pre-shutdown and pre-rollback data and entered into a t-test. Weekly and daily data samples were used.
Weekly spring and summer PM2.5 levels from 2019 were used as pre-shutdown and pre-rollback data. Pre-shutdown and pre-rollback data for a treatment county were compared to those of its control county in a two-sided paired t-test in R. The null hypothesis stated that the type of industry present in the treatment county was not associated with PM2.5 levels. The alternative hypothesis stated that the presence of the industry in the treatment county was associated with PM2.5 levels. As shown in Table 3, the oil refinery treatment-control pair produced statistically significant values with both weekly and daily data (p < 0.05), whereas the manufacturing treatment-control pair did not (p > 0.05).
Table 4
P-values of pre- and post-shutdown and rollback PM2.5 levels (two-sided unpaired t-test)
Frequency | Contra Costa | Sacramento | Napa | Sutter |
Weekly | 0.3349 | 0.003492 | 0.1401 | 0.7499 |
Daily | 0.1459 | 1.48E-08 | 0.01304 | 0.7815 |
Data from April 1, 2019, to June 30, 2019, were used as pre-shutdown and pre-rollback data. Data from April 1, 2020, to June 30, 2020, were used as post-shutdown and post-rollback data. Pre- and post-shutdown and rollback data were compared for each county in a t-test. Daily and weekly data samples were used.
Weekly spring and summer PM2.5 data from 2019 and 2020 were used as pre- and post-shutdown and rollback data, respectively. Pre-shutdown and pre-rollback data were compared to post-shutdown and post-rollback data for each county in an unpaired two-sided t-test. The null hypothesis stated that a county's PM2.5 levels did not change after the state shutdown and EPA rollbacks. The alternative hypothesis stated that PM2.5 levels did change after the state shutdown and EPA rollbacks. As shown in Table 4, Sacramento was statistically significant with both weekly and daily data (p < 0.05), while Contra Costa was significant with neither (p > 0.05). Napa was significant only with daily data (p = 0.01304), and Sutter was not significant at either frequency.
Table 5
P-values of post-shutdown and post-rollback PM2.5 data for treatment-control pairs (two-sided paired t-test)
Frequency | Contra Costa - Sacramento | Napa - Sutter |
Weekly | 6.17E-07 | 0.08321 |
Daily | 2.20E-16 | 0.0009928 |
Data from April 1, 2020, to June 30, 2020, were used as post-shutdown and post-rollback data for each treatment and control county and were entered into a t-test. Daily and weekly samples were used.
Spring and summer PM2.5 data from 2020 were used as post-shutdown and post-rollback data. Post-shutdown and post-rollback data for the treatment and control county were compared in a two-sided paired t-test. The null hypothesis stated that the PM2.5 levels of the treatment county were not statistically different from those of its control county. The alternative hypothesis stated that the PM2.5 levels of the treatment county were statistically different from those of its control county. As shown in Table 5, the oil refinery treatment-control pair produced statistically significant p-values (p < 0.05) with both weekly and daily data. The manufacturing treatment-control pair was not statistically significant with weekly data (p = 0.08321) but was significant with daily data (p = 0.0009928).
Weekly PM2.5 Levels Across all 4 counties from January 2020 to June 2020
In Fig. 5, Sacramento, the control county of the oil industry treatment-control pair, had higher PM2.5 levels than its treatment county, Contra Costa, before the state shutdown and EPA rollbacks. This ordering reverses after the shutdown and rollbacks: Contra Costa now has higher PM2.5 levels than Sacramento. The manufacturing treatment-control pair does not show the same reversal. Sutter has higher PM2.5 levels than Napa, but this difference is less consistent after the shutdown and rollbacks.
Table 6
P-values of PM2.5 differences before and after shutdown and rollbacks (two-sided paired t-test)
Treatment-Control Pair | p-value |
Contra Costa-Sacramento | 0.0002155 |
Napa-Sutter | 0.04157 |
From January 1, 2020, to June 30, 2020, a control county’s PM2.5 levels were subtracted from a treatment county’s PM2.5 levels. Data before March 31, 2020, became pre-shutdown and pre-rollback data while data after April 1, 2020, became post-shutdown and post-rollback data. For each treatment-control pair, pre- and post-shutdown and rollback data were inputted into a t-test.
To determine whether the difference in PM2.5 levels between a treatment county and its control county changed after the shutdown and rollbacks, we subtracted the PM2.5 values of the control county from those of its treatment county from January 1, 2020, to June 30, 2020. Differences from January 1, 2020, to March 31, 2020, served as pre-shutdown and pre-rollback data, while differences from April 1, 2020, to June 30, 2020, served as post-shutdown and post-rollback data. The two sets of differences were then compared in an unpaired two-sided t-test. The null hypothesis stated that the treatment-control difference did not change after the state shutdown and EPA rollbacks, whereas the alternative hypothesis stated that it did. As shown in Table 6, both pairs were statistically significant (p < 0.05), with the oil industry treatment-control pair producing a much smaller p-value (p = 0.0002155) than the manufacturing treatment-control pair (p = 0.04157).
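The procedure of differencing the two counties and then comparing the pre- and post-period differences can be sketched as follows. The sketch assumes Welch's unequal-variance form of the unpaired t-test, which is the default in R's `t.test`; the function name `welch_t` and all PM2.5 values are hypothetical. It returns the test statistic and degrees of freedom, from which the p-value is read off the t distribution.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's unpaired t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical weekly PM2.5 means; the first four weeks are pre-shutdown.
treatment = [8, 9, 7, 8, 9, 10, 9, 10]
control = [10, 10, 9, 11, 7, 7, 8, 7]
split = 4

# Treatment minus control, week by week.
gap = [t - c for t, c in zip(treatment, control)]
pre_gap, post_gap = gap[:split], gap[split:]

t_stat, df = welch_t(pre_gap, post_gap)
print(t_stat, df)
```

Testing the differences rather than the raw levels removes influences common to both counties, such as regional weather, so a significant result points to a change in the gap between the treatment and control county rather than a change in overall pollution.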