Model Performance
For the smoke-impacted model, the R2 values of the overall, spatial, and temporal CV are 0.75 (RMSE = 4.59 µg/m3), 0.59 (RMSE = 5.88 µg/m3), and 0.67 (RMSE = 5.18 µg/m3), respectively, indicating good model performance in fire grids. For the no-smoke model, the R2 values of the random, spatial, and temporal CV are 0.68, 0.47, and 0.63, with RMSEs of 3.35 µg/m3, 4.30 µg/m3, and 3.59 µg/m3, respectively, indicating satisfactory performance of the random forest model for background PM2.5. As shown in Figure S2, the random forest models slightly overestimated low PM2.5 concentrations and underestimated high PM2.5 concentrations, especially when the daily PM2.5 concentration exceeded 100 µg/m3. After aggregating the daily PM2.5 predictions to the monthly level, the R2 of the smoke-impacted and no-smoke models in the overall 20-fold CV increased to 0.84 and 0.78, respectively, indicating that the estimation bias is random. Scatter plots for the aggregated monthly CV are shown in Figure S3. The same process was applied to the spatial and temporal CV, and the R2 of both the smoke-impacted and no-smoke models improved as a result (Table S2). After aggregating the overall CV to the annual level, the R2 between all predictions and AQS measurements is 0.9, implying high accuracy of the model predictions. In terms of variable importance, CMAQ is the most important predictor in both the smoke-impacted and no-smoke models, and AOD and wind are the parameters ranked in the top five of both models (Figure S4).
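The effect of temporal aggregation on the CV statistics can be illustrated with a minimal sketch. It assumes R2 is the coefficient of determination (the text does not state which definition was used) and uses synthetic, hypothetical site data rather than the study's held-out predictions. The column names `site`, `date`, `obs`, and `pred` are illustrative only.

```python
import numpy as np
import pandas as pd


def r2_rmse(obs, pred):
    """Coefficient of determination and RMSE between observations and predictions."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot), float(np.sqrt(np.mean((obs - pred) ** 2)))


# Synthetic held-out CV table (hypothetical values, not study data).
rng = np.random.default_rng(0)
dates = pd.date_range("2018-01-01", "2018-12-31", freq="D")
frames = []
for site, base in enumerate([5.0, 8.0, 11.0, 14.0]):
    seasonal = 3.0 * np.sin(2 * np.pi * dates.dayofyear.to_numpy() / 365.0)
    obs = base + seasonal + rng.normal(0.0, 3.0, len(dates))
    pred = obs + rng.normal(0.0, 4.0, len(dates))  # zero-mean random error
    frames.append(pd.DataFrame({"site": site, "date": dates, "obs": obs, "pred": pred}))
cv = pd.concat(frames, ignore_index=True)

r2_daily, rmse_daily = r2_rmse(cv["obs"], cv["pred"])

# Aggregate held-out pairs to site-month means before scoring.
monthly = (
    cv.assign(month=cv["date"].dt.to_period("M"))
    .groupby(["site", "month"], as_index=False)[["obs", "pred"]]
    .mean()
)
r2_monthly, rmse_monthly = r2_rmse(monthly["obs"], monthly["pred"])

print(f"daily:   R2={r2_daily:.2f}, RMSE={rmse_daily:.2f}")
print(f"monthly: R2={r2_monthly:.2f}, RMSE={rmse_monthly:.2f}")
```

In this toy setting the prediction error is zero-mean by construction, so it averages out over a month and the monthly statistics improve. A systematic bias would not shrink under aggregation in the same way.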
Spatiotemporal Patterns of Smoke PM2.5 across the CONUS
Figure 2 presents spatial distributions of annual mean smoke PM2.5 in the CONUS from 2007 to 2018. While the Western U.S. has seen a significant and more persistent impact of fire smoke on PM2.5 levels, other regions including the Midwest and the Southeast have also experienced high smoke PM2.5 in certain years. For example, annual average smoke PM2.5 concentrations over 8 µg/m3 occurred in California, Oregon, and Washington in 2007–2009, 2011, 2013, 2017 and 2018, and over 50% of the areas in these states were impacted by fire smoke during these years. Along the California coast and in the Central Valley, annual average smoke PM2.5 concentrations exceeded 12 µg/m3 in 2007, 2017 and 2018. We observed the highest annual average wildfire smoke PM2.5 level, 25 µg/m3, north of Ventura County in Southern California in 2017. Other Western states such as Idaho, Montana, Utah, Colorado, Arizona, and New Mexico have been affected to a lesser degree, with annual mean smoke PM2.5 levels ranging between 0 and 5 µg/m3. The Southeast is the region second most affected by fire smoke. For example, annual smoke PM2.5 levels up to 9 µg/m3 were common in Alabama, Georgia, and the Carolinas. Fire smoke also contributed significantly to elevated PM2.5 levels in Georgia and Florida in 2010 and 2017. In addition, air quality in the Midwestern states was periodically affected by fire smoke. For example, approximately half of Texas, Oklahoma and Kansas showed detectable fire smoke impact in 2010, 2011, and 2017, with high smoke PM2.5 levels observed over large cities such as Dallas, Austin and San Antonio. PM2.5 levels in the states around the Great Lakes and in the Northeastern U.S. were rarely affected by fire smoke during our study period.
Conducting large-scale epidemiological studies to investigate the impact of fire smoke on human health has been challenging, largely due to the difficulty in estimating spatially resolved exposure to fire smoke PM2.5. Recently, a few modeling studies of smoke PM2.5 concentrations in the CONUS have been conducted with spatial resolutions ranging from 10 to 15 km (17, 39). Using machine-learning models such as those presented in this study allows the integration of CTM fire simulations, high-resolution satellite remote sensing of fire smoke, and the broader spatial representation of the PurpleAir sensor network to achieve high spatial resolution (1 km), high temporal resolution (daily), and full coverage of the CONUS for a 12-yr period. The temporal trend and spatial characteristics of our model-predicted smoke PM2.5 concentrations align with major fire events across the country. For example, data from the National Interagency Fire Center (40) showed that fire activities in Southern California, eastern Texas, and southern North Carolina and Tennessee in 2007 were 125% and 121% of the previous 10-year average, respectively. The acres burned in the Rocky Mountains were 367% and 351% of the previous 10-year average in 2012 and 2017, respectively, and our model successfully captured these features. Compared with uncalibrated CMAQ simulations of smoke PM2.5 (Figure S5 Panel A), our predictions better represent the spatial and temporal distribution of fire smoke. For instance, our model captured the high smoke PM2.5 values in the West and Southeast during extreme fire years such as 2007 and 2018 (Fig. 1), and the low smoke PM2.5 values in 2015, which follow the same temporal trend as that reported by the National Interagency Fire Center (40). In addition, our model was able to capture finer spatial features more accurately due to its high spatial resolution of 1 km.
Compared with previous smoke PM2.5 estimates at coarse resolution, our predictions provide a clearer boundary of the smoke-impacted areas and capture detailed variability in population exposure levels. As illustrated in Figure S6, the population within a 100 km2 area in Sacramento, California, could be assigned 100 unique smoke PM2.5 values based on location rather than a single average value, which makes high-resolution health impact studies feasible.
To the best of our knowledge, our study is the first large-scale attempt to use calibrated PM2.5 concentration measurements from low-cost sensors such as PurpleAir monitors in conjunction with AQS monitors to better characterize the spatial variability of smoke PM2.5. Previous research has shown that low-cost sensor measurements can increase the likelihood of detecting wildfire smoke (21, 22), and integrating low-cost sensor data with regulatory measurements has allowed for better training of satellite-based machine learning models for identifying air pollution hotspots (26, 41). In our study, PurpleAir sensors reported extreme PM2.5 concentrations over 200 µg/m3 during the Camp fire in California, while the highest AQS measurement was approximately 100 µg/m3 because no AQS monitors were located near the smoke plumes. Including the high PM2.5 measurements from PurpleAir in our training dataset reduced the model's underestimation of high PM2.5 values. For instance, the smoke PM2.5 predictions from models without PurpleAir (Figure S5, Panel B) were biased low in California, where high smoke PM2.5 values consistently occurred, and the difference in annual smoke PM2.5 predictions between models with and without PurpleAir measurements reached up to 16 µg/m3 in 2018. Unlike earlier studies, which attributed the deviation from background PM2.5 levels to smoke using ground total PM2.5 measurements, satellite-based smoke plume identification, and air trajectories (17, 39), we employed two different CMAQ simulations, with and without fire emissions, along with satellite-based HMS smoke contours to more accurately label smoke-impacted areas and days. Our approach facilitates independent modeling of both background PM2.5 and total PM2.5 accounting for smoke impact nationwide.
Effect of Fire Smoke on National PM2.5 Concentration Levels
Using our daily model predictions, we assessed the impact of fire smoke on the regulatory air quality monitoring network. We defined a smoke impact day as a day on which fire smoke contributed more than 25% of the model-estimated daily total PM2.5 mass concentration at the location of an air quality monitoring station included in the EPA AQS. Daily PM2.5 concentrations at ~40% of the 1836 AQS monitoring sites were significantly affected by smoke for more than a month each year during our study period (Fig. 3). In 2009 and 2010, when our model predicted the lowest smoke impact on national PM2.5 levels, over 25% of the national ambient PM2.5 monitoring network was under significant smoke impact for more than a month. In intensive fire years such as 2017, 50% of all monitoring locations were affected for at least a month, indicating a widespread impact at the national scale. During the worst fire year, 2007, 25% of all monitoring locations were affected for more than 90 days. Smoke impact on air quality was highest in summer and fall in most years. However, in low fire years such as 2009 and 2010, fire smoke had the greatest impact in spring and fall.
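The smoke impact day definition above can be expressed as a short sketch. It assumes daily model estimates are stored as arrays of shape (sites, days) and reads "more than a month" as more than 30 days. The function names and the toy values are hypothetical, not the study's code or data.

```python
import numpy as np

SMOKE_FRACTION_THRESHOLD = 0.25  # from the text: more than 25% of daily total PM2.5
DAYS_THRESHOLD = 30  # assumption: "more than a month" read as more than 30 days


def smoke_impact_days(total_pm25, smoke_pm25):
    """Count smoke impact days per site from (n_sites, n_days) arrays."""
    total = np.asarray(total_pm25, dtype=float)
    smoke = np.asarray(smoke_pm25, dtype=float)
    # Equivalent to smoke / total > 0.25, written without division
    # so that days with zero total PM2.5 do not raise warnings.
    impacted = smoke > SMOKE_FRACTION_THRESHOLD * total
    return impacted.sum(axis=1)


def share_of_sites_affected(days_per_site, min_days=DAYS_THRESHOLD):
    """Fraction of sites with more than `min_days` smoke impact days."""
    return float(np.mean(np.asarray(days_per_site) > min_days))


# Toy example: 3 sites, 4 days (hypothetical values in µg/m3).
total = np.array([[10.0, 10.0, 10.0, 10.0],
                  [8.0, 8.0, 8.0, 8.0],
                  [12.0, 12.0, 12.0, 12.0]])
smoke = np.array([[3.0, 0.0, 5.0, 2.5],
                  [0.0, 0.0, 0.0, 1.0],
                  [4.0, 4.0, 4.0, 4.0]])

days = smoke_impact_days(total, smoke)
print(days)                                       # per-site counts
print(share_of_sites_affected(days, min_days=1))  # share with more than 1 day
```

Note that a day with exactly 25% smoke contribution (site 0, day 4 in the toy data) is not counted, since the definition requires a contribution of more than 25%.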
AQS’s Representativeness of Population Exposure to Fire Smoke
Using our model predictions and annual population estimates at 1 km resolution, we estimated the U.S. population affected by fire smoke. As shown in Table 1, nearly the entire population in the CONUS, ranging from 95% in 2018 to 100% in 2007, has been exposed to fire smoke. On average, a slightly higher percentage of people living outside the vicinity of an EPA AQS monitoring station (defined by a 5 km radius) has been exposed to fire smoke. The average duration of population exposure to fire smoke showed a more substantial difference. On average, people living outside the vicinity of an AQS monitoring station experienced 25.2 smoke impact days, 36.5% (ranging from −8% in 2018 to 70% in 2012) more than people living near an AQS station. While the mean model-estimated total PM2.5 concentration in regions near an AQS station (10.79 µg/m3) is significantly higher than that in regions without AQS coverage (8.87 µg/m3), the estimated smoke PM2.5 concentration shows the opposite pattern (0.50 µg/m3 vs. 0.65 µg/m3). Since the majority of AQS stations are located in urban areas, these findings suggest that using EPA observations alone may substantially underestimate both the duration and the concentration of fire smoke exposure among rural and suburban populations.
Table 1
Fire smoke impact on the U.S. population.
Year | Total Population (Population without AQS coverage) (million) | Smoke Impacted Total Population (Smoke Impacted Population without AQS coverage) (million) | Smoke Impact Days among Population with AQS coverage (among Population without AQS coverage) (days) | Total PM2.5 (Smoke PM2.5) with AQS coverage (µg/m3) | Total PM2.5 (Smoke PM2.5) without AQS coverage (µg/m3) |
2007 | 300.1 (73.6) | 299.7 (73.6) | 38.2 (54.6) | 11.90 (0.96) | 9.87 (1.11) |
2008 | 302.6 (74.1) | 298.3 (72.6) | 21.2 (22.4) | 10.42 (0.32) | 8.26 (0.38) |
2009 | 305.5 (70.9) | 300.3 (69.9) | 13.5 (19.4) | 11.20 (0.25) | 8.45 (0.22) |
2010 | 307.0 (72.0) | 285.7 (71.6) | 12.8 (22.6) | 10.83 (0.57) | 9.73 (0.77) |
2011 | 310.0 (72.9) | 307.9 (72.7) | 16.4 (25.7) | 11.43 (0.51) | 9.14 (0.73) |
2012 | 299.9 (72.0) | 289.0 (71.6) | 11.9 (20.3) | 10.35 (0.53) | 9.28 (0.83) |
2013 | 313.1 (74.0) | 308.0 (72.9) | 16.7 (19.1) | 11.57 (0.61) | 9.34 (0.66) |
2014 | 317.3 (74.6) | 310.8 (74.3) | 16.9 (22.5) | 9.40 (0.31) | 8.74 (0.40) |
2015 | 319.8 (74.9) | 313.2 (74.6) | 14.4 (19.2) | 9.37 (0.48) | 7.91 (0.64) |
2016 | 321.5 (74.9) | 319.7 (74.9) | 17.6 (25.0) | 9.30 (0.31) | 7.98 (0.48) |
2017 | 324.1 (74.6) | 321.6 (74.5) | 26.2 (33.8) | 11.22 (0.97) | 8.78 (0.92) |
2018 | 325.6 (74.8) | 308.9 (73.1) | 20.2 (18.5) | 10.51 (0.61) | 8.95 (0.65) |
Average | 312.2 (73.6) | 305.3 (73.0) | 18.8 (25.2) | 10.79 (0.50) | 8.87 (0.65) |
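The comparison between populations with and without AQS coverage can be sketched as follows. This is a minimal illustration, assuming grid-cell and station coordinates are already in a projected system in kilometers and that exposure is summarized as a population-weighted mean. The helper names and toy values are hypothetical, and the brute-force distance matrix is only practical for small grids.

```python
import numpy as np


def near_station_mask(cell_xy_km, station_xy_km, radius_km=5.0):
    """True for grid cells within `radius_km` of at least one station."""
    cells = np.asarray(cell_xy_km, dtype=float)
    stations = np.asarray(station_xy_km, dtype=float)
    # Pairwise Euclidean distances, shape (n_cells, n_stations).
    dist = np.linalg.norm(cells[:, None, :] - stations[None, :, :], axis=2)
    return (dist <= radius_km).any(axis=1)


def pop_weighted_mean(values, population, mask):
    """Population-weighted mean of `values` over the cells selected by `mask`."""
    values = np.asarray(values, dtype=float)
    population = np.asarray(population, dtype=float)
    return float(np.average(values[mask], weights=population[mask]))


# Toy example (hypothetical): 4 grid cells, 1 station at the origin.
cells = np.array([[0.0, 0.0], [3.0, 4.0], [10.0, 0.0], [20.0, 20.0]])
stations = np.array([[0.0, 0.0]])
population = np.array([100.0, 300.0, 50.0, 150.0])
smoke_days = np.array([10.0, 20.0, 30.0, 40.0])

covered = near_station_mask(cells, stations, radius_km=5.0)
days_with_aqs = pop_weighted_mean(smoke_days, population, covered)
days_without_aqs = pop_weighted_mean(smoke_days, population, ~covered)
print(days_with_aqs, days_without_aqs)
```

For a full 1 km CONUS grid, a spatial index such as a k-d tree would be a more practical way to find cells near stations than a dense distance matrix.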
Impact of Fire Smoke on Attainment Status with the Proposed New PM2.5 Standard
In January 2023, the U.S. EPA proposed to lower the NAAQS for annual mean PM2.5 concentrations, calculated as the average of the past three years, to a value between 9 µg/m3 and 10 µg/m3. We estimated the total population as well as the number of AQS monitoring sites that would reside in nonattainment areas under the new standard (Tables S3 and S4). Without considering the impact of fire smoke, an average of 116.83 million people (from 68.73 million in 2016 to 148.74 million in 2013) and 30% of all AQS monitoring sites (from 15% in 2017 to 40% in 2011) in the CONUS would be in areas with annual mean PM2.5 concentrations equal to or above 10 µg/m3. When we considered the fire smoke contribution to PM2.5 levels, an additional 21.4 million people and 6% of AQS monitors would reside in nonattainment areas. Under the stricter standard of 9 µg/m3, the average affected population would increase to 167.23 million without considering the effect of fire smoke, and to 197.68 million (ranging from 153.73 million in 2016 to 225.27 million in 2013) with the contribution of fire smoke. Regarding air quality monitoring, an average of 41% of all AQS monitoring sites would fall into nonattainment areas. When the contribution of fire smoke was considered, this percentage rose to 50% (ranging from 37% in 2016 to 58% in 2011 and 2012).
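The comparison above can be sketched for a gridded dataset. This is a simplified illustration, not the procedure used for regulatory designations or necessarily the one used in this study: it assumes a per-cell three-year mean, a fixed population per cell, the "equal to or above" criterion stated in the text, and background PM2.5 computed as total minus smoke PM2.5. All names and values are hypothetical.

```python
import numpy as np


def three_year_mean(annual):
    """Rolling 3-year mean over axis 0 of an (n_years, n_cells) array."""
    annual = np.asarray(annual, dtype=float)
    return (annual[:-2] + annual[1:-1] + annual[2:]) / 3.0


def nonattainment_population(annual_pm25, population, standard):
    """Population in cells whose 3-year mean PM2.5 is at or above `standard`."""
    exceeds = three_year_mean(annual_pm25) >= standard
    return (exceeds * np.asarray(population, dtype=float)).sum(axis=1)


# Toy example (hypothetical): 3 years x 3 grid cells, population in millions.
total = np.array([[9.7, 12.0, 8.0],
                  [10.7, 11.0, 8.5],
                  [10.2, 10.0, 9.5]])
smoke = np.array([[0.6, 0.5, 0.2],
                  [0.9, 0.4, 0.3],
                  [0.6, 0.3, 0.4]])
population = np.array([1.0, 2.0, 3.0])

for standard in (10.0, 9.0):
    with_smoke = nonattainment_population(total, population, standard)
    without_smoke = nonattainment_population(total - smoke, population, standard)
    print(standard, with_smoke, without_smoke, with_smoke - without_smoke)
```

In the toy data, the first cell falls at or above 10 µg/m3 only when the smoke contribution is included, which mirrors the kind of additional nonattainment population reported in the text.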
As the increasing regulation of emissions of PM2.5 and its precursors from anthropogenic sources has effectively improved air quality in most parts of the U.S., fire emissions are becoming a major contributor to PM2.5. The proximity of large populations to wildland fires poses a nontrivial threat to public health and to compliance with ambient air quality standards. According to the EPA (42), approximately 20.9 million Americans (2010 population) reside in PM2.5 nonattainment areas based on the current NAAQS as of 2023. Our model estimated that 95.9 to 146.3 million more people would live in nonattainment areas if the annual mean PM2.5 NAAQS were lowered to between 9 and 10 µg/m3. Our calculations also suggested that taking the impact of fire smoke into account would result in an additional 21.4 to 30.5 million people falling into nonattainment areas. As most wildland fires start in rural areas, fire smoke PM2.5 would disproportionately affect suburban and rural populations. The comprehensive spatial coverage of our model estimates would enable future research on the differential health effects of air pollution exposure associated with the altered PM2.5 composition in these communities.