Feasibility of Low-Cost Particle Sensor Types in Long-Term Indoor Air Pollution Health Studies After Repeated Calibration Over a 2-Year Timeframe

Background: Previous studies have explored using calibrated low-cost particulate matter (PM) sensors, but important research gaps remain regarding long-term performance and reliability. Objective: Evaluate the longitudinal performance of low-cost particle sensors by measuring sensor performance changes over 2 years of use. Methods: 51 low-cost particle sensors (Airbeam 1 N=29; Airbeam 2 N=22) were calibrated four times over a 2-year timeframe between 2019 and 2021. Cigarette smoke-specific calibration curves for Airbeam 1 and 2 PM sensors were created by directly comparing their output with simultaneous 1-min readings of a Thermo Scientific Personal DataRAM PDR-1500 unit with a 2.5 µm inlet. Results: Inter-sensor variability in calibration coefficient was high, particularly in Airbeam 1 sensors at study initiation. Calibration coefficients for both sensor types trended downwards over time to <1 at the final calibration timepoint [Airbeam 1 Mean (SD) = 0.87 (0.20); Airbeam 2 Mean (SD) = 0.96 (0.27)]. We lost more Airbeam 1 sensors (N=27, failure rate 48.2%) than Airbeam 2 sensors (N=2, failure rate 16.7%) due to electronics, battery, or data output issues.


Introduction
Studies of air pollution-associated health impacts often require measuring ambient concentrations of air pollutants. While monitoring of PM2.5 concentrations has contributed to understanding and reducing ambient PM2.5 to improve air quality standards, rigorous measurement of indoor air pollution remains a challenge.
Conventionally, the measurement of ambient PM2.5 concentrations requires either a labor-intensive gravimetric filter-based method with size-specific inlets, or sophisticated and approved real-time instruments. Such equipment is expensive and not readily portable, thus limiting the number of locations that can be sampled within a given time. Central monitoring networks that use advanced instruments utilizing gravimetry, light scattering, or beta attenuation have been established by state and federal agencies to support efforts to achieve the federal National Ambient Air Quality Standards (NAAQS). While central monitoring is obviously important for monitoring PM2.5 exposures within the microenvironments of cities, neighborhoods, buildings, and even homes, personal monitoring is considered the optimal approach to assess an individual's exposure levels to PM2.5 [1].
Until recently, monitoring indoor settings at a high spatial and temporal resolution was impractical due to the cost, size, and expertise needed to operate monitoring equipment. Real-time personal monitoring or multi-location monitoring is, however, not a new concept. Innovations in air quality monitoring have addressed cost in the last decade. A combination of technological advancements (cheaper electronic boards and smaller light scattering sensors), public interest in air pollution, and the increased popularity of citizen science has resulted in the development and proliferation of low-cost PM2.5 sensors and devices. These low-cost sensors have gained in popularity for a range of uses, from home and personal monitoring to citizen science and larger-scale academic research [2-4]. The advantages of low-cost PM2.5 sensors for research include: 1) deployment in large numbers to increase spatial and temporal coverage; 2) ease of use and maintenance; and 3) a battery power source that permits remote or portable use [5-7]. In addition, they can be connected via Wi-Fi or Bluetooth technology to transmit data, sometimes in real time, to central servers and crowdsourcing platforms to share data and cover large geographic areas with extended spatial and temporal resolution.
However, these simple, low-cost sensors have limitations and require routine testing and calibration prior to use in scientific studies. Much work has been done in recent years to address these limitations, and results have demonstrated that low-cost sensors generally have acceptable reliability but also technological limitations and inter-instrument variability [2,7]. A key finding of many of these studies is that important research gaps remain regarding durability and the need for calibration of individual units prior to use for research [5,7-9]. Few studies, for example, have examined the performance of a network of low-cost sensors over an extended period of time. One such study showed that some PM2.5 sensors were relatively stable over time when tested over a year; however, that study focused on measurements in outdoor environments with concentrations ranging from 6-41 µg/m³ [10].
In tandem, public housing authorities (PHAs) have been federally mandated to implement smoke-free housing (SFH) policies in their developments [11-13]. Despite policy implementation in July 2018, there is still some evidence of cigarette smoking within New York City Housing Authority (NYCHA) developments [11]. Stemming from a larger, quasi-experimental study evaluating the impact of SFH policies on secondhand smoke exposure in select NYCHA buildings, we utilized a network of low-cost sensors to evaluate indoor PM. This current analysis sought to assess whether rigorous calibration allows low-cost sensors to be used for indoor air quality measurements in the field for long periods of time without degradation in reliability.
To achieve this objective, we repeatedly calibrated and utilized a large number of low-cost first and second generation Airbeam PM2.5 sensors, over a 2-year period, to assess PM2.5 concentrations in urban high-rise buildings with a focus on measuring indoor tobacco smoke.

Generation of Calibration Curves for Cigarette Smoke:
Cigarette smoke-specific calibration curves for the Airbeam 1 and 2 PM2.5 sensors were created in a laboratory setting via the direct comparison of the output of the low-cost Airbeam sensors with simultaneous 1-min readings produced by a factory-calibrated Thermo Scientific Personal DataRAM PDR-1500 unit with a 2.5 µm inlet (Thermo Environmental Instruments, Waltham, MA). The PDR-1500 unit is a widely used instrument that has been shown to be reliable in previous studies [14-22]. Over the course of the 2-year period, our low-cost sensors were calibrated four times using the same PDR-1500 unit, where the internal filter was checked to control the real-time measurements gravimetrically. The Airbeam 1 and 2 devices utilize two different low-cost sensors: the Shinyei PPD60PV and Plantower PMS 7003 infra-red light scattering particle sensors, respectively. The PDR-1500 unit was zeroed with particle-free air prior to each run.
To perform the calibration, 8-12 Airbeam units were placed into an airtight stainless-steel chamber, held at room temperature and at the building's ambient relative humidity (below 50%), with access ports permitting the introduction of cigarette smoke or HEPA-filtered air. The PDR-1500 was connected to a sampling port for measuring the PM2.5 concentrations inside the chamber. This instrument has both an inlet and outlet where tubes are connected to inject cigarette smoke into the chamber; the PDR-1500 was not placed inside the chamber to prevent contamination resulting from its enclosure with cigarette smoke. A smoking machine (Borgwaldt, Hamburg, Germany) was used to inject fresh mainstream cigarette smoke using 3R4F reference cigarettes into the chamber until the PDR-1500 registered a particle mass concentration greater than 1,000 µg/m³. A high concentration value such as 1,000 µg/m³ exceeds the upper limit for PM2.5 values for both low-cost particle sensor types. Airbeam 1 and Airbeam 2 sensors have different saturation points, at 80 µg/m³ and 200 µg/m³, respectively (i.e., the light scattering-derived PM2.5 output plateaus); starting at 1,000 µg/m³ ensured that the decreasing PM2.5 calibration curve would begin above their detection ceilings (approximately 180 µg/m³ and 800 µg/m³, respectively). After cigarette smoke generation was stopped, the sample pump and internal filter of the PDR-1500 slowly removed cigarette smoke from the chamber, which was replaced by HEPA-filtered room air. The resulting time-dependent decrease in PM2.5 was used to develop the calibration curve. The start times of the Airbeam units and PDR-1500 particulate matter readings were synchronized, and the 1-min outputs were recorded beginning above the nominal upper detection limit and continuing until the PDR-1500 values stabilized in the low single-digit µg/m³ range. Each run lasted approximately one hour.
Readings from each Airbeam (X-axis) were matched by synchronized timestamp with the corresponding values from the PDR-1500 (Y-axis). Using Excel, a unique calibration equation for each Airbeam unit was calculated by linear regression up to 80 µg/m³, which was the expected upper limit for indoor PM2.5. Polynomial regression models were also generated; however, the output of these models was linear up to 80 µg/m³, which strengthened our decision to use linear models. Each unique equation and accompanying R value was recorded and assigned to the unit by serial number. Because of differences in the sensor types' output, for consistency, we calculated calibration coefficients using the total PM reading from Airbeam 1 sensors and the PM10 output from Airbeam 2 sensors. Both sensor types use an algorithm based on an internal equation to generate PM output; Airbeam 1 sensors do not split the output into PM1, PM2.5 and PM10 values. The calibration coefficient for cigarette smoke was developed as a multiplication factor to correct the Airbeam PM2.5 output and calculated as:
Calibration Coefficient = slope of the PDR-1500 (Y-axis) vs Airbeam (X-axis) calibration curve
To assess the effect of particle composition on the calibration curve, the Airbeam devices were also calibrated using airborne particles in the NYC subway system. As in the cigarette smoke calibration procedure, the output of four Airbeam 1 and four Airbeam 2 units was compared to the PDR-1500 PM2.5 output, and a calibration coefficient was calculated for subway PM2.5.
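The per-unit calibration described above can be illustrated with a minimal sketch. The study used Excel; the Python below is only an illustration under stated assumptions. The function names, the example readings, and the choice to apply the 80 µg/m³ cut-off to the reference (PDR-1500) reading are all hypothetical and are not taken from the study.

```python
from statistics import fmean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def calibration_coefficient(airbeam, reference, upper_limit=80.0):
    """Slope of reference (Y) vs Airbeam (X) for timestamp-matched 1-min pairs.

    Assumption: pairs are kept only when the reference reading is at or
    below upper_limit (µg/m³); the source does not say which axis was cut.
    """
    pairs = [(x, y) for x, y in zip(airbeam, reference) if y <= upper_limit]
    slope, _intercept = fit_line([x for x, _ in pairs], [y for _, y in pairs])
    return slope

# Hypothetical paired readings (µg/m³); the last pair lies above the cut-off.
airbeam_readings = [5.0, 10.0, 20.0, 40.0, 60.0, 70.0]
pdr_readings = [6.0, 12.0, 24.0, 48.0, 72.0, 150.0]

coef = calibration_coefficient(airbeam_readings, pdr_readings)
# The coefficient is applied as a multiplication factor to raw Airbeam output.
corrected = [coef * x for x in airbeam_readings]
```

With these made-up readings the reference reads 1.2 times the Airbeam below the cut-off, so the fitted coefficient is about 1.2; the saturated pair is excluded from the fit.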

Field Sampling Periods
We calibrated 51 low-cost particle sensors (Airbeam 1 generation N=29; Airbeam 2 generation N=22) at 4 different timepoints over a 2-year period spanning from 2019 to 2021. After each laboratory calibration, the Airbeam units were deployed every 6 months in a large natural experiment evaluating the impact of new smoke-free housing (SFH) policies on air quality in public housing units [11,23]. Due to the onset of the COVID-19 pandemic, we were unable to perform Airbeam sensor calibration 24 months post-SFH policy implementation (April-September 2020). A calibration technical error for select Airbeam 2 sensors occurred at 30 months post-SFH policy (December-March 2021), leading to their exclusion from data analysis at that timepoint. We then calibrated all 51 Airbeam sensors at 36 months post-SFH policy (May-September 2021) to obtain a final calibration coefficient.

Data Analysis:
We descriptively tabulated the mean (SD) calibration coefficients at four different 6-month timepoints over a 2-year period from 2019 to 2021 for the two different Airbeam sensor types. We performed independent t-tests to measure statistically significant differences in calibration coefficient means between particle sensor types, and also characterized the between-and-within variability for calibration coefficient measurements. Because the light scattering properties of airborne particles are influenced by particle composition, we compared the mean (SD) calibration coefficients for cigarette smoke and subway PM2.5 using an independent t-test. Lastly, we used a difference-in-difference (DID) approach to compare within-group changes between Airbeam 1 and Airbeam 2 sensors across four different calibration timepoints. Regression models included fixed effects for particle sensor type (Airbeam 1 vs Airbeam 2 sensors) and data collection timepoints (12, 18, 30 and 36 months post-SFH policy implementation [24]). We adjusted for the clustering of individual Airbeam IDs and repeated measures over time. Model-based mean differences with 95% confidence intervals were calculated for each particle sensor type over time. P-values were reported after implementation of the independent t-tests, with a significance level set at p<0.05, using a two-sided test. All analyses were performed using SAS statistical software, version 9.4 (SAS Institute).
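As an illustration of the independent t-test comparison described above, the following sketch computes a two-sample t statistic from two groups of calibration coefficients. The analyses in this study were run in SAS; this Python version is only a sketch. It assumes the pooled-variance (equal-variance) form, which the text does not specify, and the coefficient values are hypothetical.

```python
from math import sqrt
from statistics import fmean, variance

def pooled_t_statistic(group_a, group_b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    # Pooled estimate of the common variance from the two sample variances.
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (fmean(group_a) - fmean(group_b)) / standard_error, df

# Hypothetical calibration coefficients for a handful of units of each type.
airbeam1_coefs = [1.0, 1.2, 1.4]
airbeam2_coefs = [1.5, 1.7, 1.9]

t_stat, dof = pooled_t_statistic(airbeam1_coefs, airbeam2_coefs)
```

The two-sided p-value would then be read from a t distribution with `dof` degrees of freedom, for example with a statistics package such as SciPy (`scipy.stats.ttest_ind`).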
We examined the individual time trends in calibration coefficient measurements for low-cost particle sensors over a 2-year period, grouped by particle sensor type (Supplemental Figure S1), and descriptively categorized all low-cost particle sensors that were taken out of circulation over the 2-year period (Supplemental Table S1). We then examined the correlation between the number of unique instances of use for individual Airbeam sensors and their final calibration coefficients at the end of the 2-year period (Supplemental Table S2 and Supplemental Figure S2).

Sample Characteristics
We conducted a descriptive characterization of the mean (SD) calibration coefficients at four different timepoints over a 2-year timeframe from 2019 to 2021 (Figure 1). At our first timepoint, our sample included a total of N=56 Airbeam 1 sensors and N=24 Airbeam 2 sensors. We observed more equipment failure over time in Airbeam 1 sensors (n=27, failure rate 48.2%) than in Airbeam 2 sensors (n=2, failure rate 16.7%). These equipment failures occurred for a variety of reasons including, but not limited to: cockroach infestations, not recording data properly (i.e., inconsistent relative humidity, temperature, or PM outputs), reading null values in PM measurements, and failure during calibration (Supplemental Table S1). As a result, our effective sample size decreased to N=37 Airbeam 1 sensors and N=21 Airbeam 2 sensors at the second timepoint, and N=29 Airbeam 1 sensors and N=22 Airbeam 2 sensors at the third and fourth timepoints. We thus restricted these analyses to the N=29 Airbeam 1 sensors and N=22 Airbeam 2 sensors available across all 4 calibration timepoints. The PM2.5 concentration readout of Airbeam PM2.5 sensors was less than that of the PDR-1500 reference instrument at each calibration timepoint.

Between-and-Within Variability in Calibration Coefficients for low-cost particle sensor types
On an individual unit basis, we observed a high degree of inter-sensor variability in calibration coefficients across both low-cost particle sensor types over a 2-year timeframe (Figure 1). There was a notable decline in Airbeam calibration coefficients consistent across both low-cost particle sensor types, with greater inter-monitor variability observed in Airbeam 1 sensors at the first calibration timepoint and in Airbeam 2 sensors at the fourth calibration timepoint. During the second calibration timepoint, the mean (SD) calibration coefficient for Airbeam 1 sensors (1.14 (0.22)) was lower compared to Airbeam 2 sensors (1.59 (0.25)) (p<0.0001). Because of the technical errors in the Airbeam 2 calibrations, a calibration coefficient mean was determined only for Airbeam 1 sensors (1.19 (0.34)) during the third calibration timepoint. Overall, Airbeam 2 sensors fared better over the 2-year timeframe compared to Airbeam 1 sensors.

Least Square Mean Differences in Calibration Coefficients for low-cost particle sensor types
We conducted a DID model approach for repeated measures, and characterized the least square mean differences [MD (95% CI)] spanning from 2019 to 2021 for each low-cost particle sensor type (Table 1).
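The core idea of a difference-in-difference comparison can be sketched from group means alone. This is a simplified illustration, not the regression model fitted in this study: it ignores the adjustment for clustering by sensor ID and for repeated measures, and all coefficient values below are hypothetical.

```python
from statistics import fmean

def did_estimate(g1_before, g1_after, g2_before, g2_after):
    """Difference-in-difference: change in group 2 minus change in group 1."""
    change_g1 = fmean(g1_after) - fmean(g1_before)
    change_g2 = fmean(g2_after) - fmean(g2_before)
    return change_g2 - change_g1

# Hypothetical calibration coefficients at an early and a late timepoint.
airbeam1_early, airbeam1_late = [1.4, 1.6], [0.9, 1.1]
airbeam2_early, airbeam2_late = [1.5, 1.7], [1.2, 1.4]

did = did_estimate(airbeam1_early, airbeam1_late, airbeam2_early, airbeam2_late)
```

In this made-up example Airbeam 1 drops by 0.5 and Airbeam 2 by 0.3, so the estimate is about 0.2: the second group's coefficient declined less than the first group's over the same interval.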

Comparison of Calibration Coefficients for Cigarette Smoke versus Subway Particulate Matter
Because particle composition can affect light scattering properties, we characterized the calibration coefficients for two different particle types at a single timepoint (Table 2) and compared the calibration coefficients for cigarette smoke vs. particulate matter (PM2.5) present in subway stations. The calibration coefficients of 1.79 (0.76) for cigarette smoke and 1.22 (0.39) for subway PM2.5 were not statistically different (p=0.08).

Correlation between Unique Instances of Use and Final Calibration Coefficient for Individual low-cost particle sensor types
To determine if sensor usage affected Airbeam output over time, we characterized the unique instances of use (i.e., the number of 7-day indoor sampling periods that a sensor was used), and the final calibration coefficient for all 51 individual Airbeam sensors (Supplemental Table S2). We examined the correlation between the number of 7-day indoor sampling periods that an individual sensor was used and its final calibration coefficient at the fourth calibration timepoint (Supplemental Figure S2). We did not observe a strong correlation for Airbeam 1 sensors (R² = 0.16) or for Airbeam 2 sensors (R² = 0.09). The slope of the curve for Airbeam 1 suggests that the more the sensors were used, the greater the deviation of their output from the PDR-1500's output, while the Airbeam 2 curve suggested no change with an increase in usage.
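The usage-versus-coefficient relationship can be summarized by the slope and R² of a simple linear fit, sketched below. The helper name and the data are hypothetical and do not reproduce Supplemental Table S2.

```python
from statistics import fmean

def slope_and_r_squared(xs, ys):
    """Least-squares slope and coefficient of determination for y regressed on x."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, (sxy ** 2) / (sxx * syy)

# Hypothetical data: number of 7-day sampling periods and final coefficient per unit.
sampling_periods = [2, 4, 6, 8]
final_coefs = [1.0, 0.9, 0.95, 0.8]

slope, r2 = slope_and_r_squared(sampling_periods, final_coefs)
```

A negative slope with a low R² would mean that the coefficient tends to fall with use but that usage explains little of the unit-to-unit spread.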

Discussion
To our knowledge, this analysis is one of the first long-term longitudinal assessments of performance and reliability of low-cost particle sensors in measuring indoor tobacco smoke. We observed a high degree of inter-sensor variability across both particle sensor types, particularly in Airbeam 1 sensors at the study's initiation. Change in calibration coefficients over time for individual Airbeam units was detected, suggesting a degradation of low-cost particle sensors for longitudinal assessment (Supplemental Figure S1). There were also notable downward trends in calibration coefficients over time, with the mean calibration coefficient for both Airbeam 1 and 2 sensors falling below 1 at the final calibration timepoint. Findings lend support to the conclusion that the routine calibration of individual Airbeam units might help to improve their utility and performance over time. Overall, Airbeam 2 particle sensors fared better than Airbeam 1 sensors, suggesting greater durability of Airbeam 2 sensors for longitudinal assessment.
Our findings suggest that low-cost particle sensors might be differentially subjected to degradation, seen in the greater loss of Airbeam 1 sensors than Airbeam 2 sensors over time. While two of these failures, and the loss of units, resulted directly from the public housing environments (i.e., roach infestation), other failures were more generally concerning for the use of non-calibrated low-cost particle sensors for longitudinal assessment of air quality. Interestingly, we did not observe a strong correlation between the unique instances of field use of sensors over the 2-year period and their final calibration coefficients measured at the fourth calibration timepoint, suggesting that low-cost sensor degradation over time might be more contingent on particle sensor type, rather than individual sensor usage.
Surprisingly, the calibration coefficients were not different for cigarette smoke or subway PM (primarily combustion products and iron-rich friction particles, respectively), suggesting that the light scattering physics of these low-cost particle sensors was not significantly affected by these two particular particle types. Our finding of similar calibration coefficients was limited to two particle types, however, and further studies would be needed to assess a range of particles with different compositions. Other researchers have observed that, in addition to particle composition, the accuracy of PM2.5 sensor output also depends upon particle size and humidity [24]. Thus, low-cost sensors require routine calibration in the laboratory with the PM2.5 and environmental conditions of interest.
Our current analysis provides a robust assessment of the longitudinal utility of low-cost particle sensors. Previous studies have measured the utility of low-cost particle sensors for PM monitoring where reference-standard equipment is not available or feasible, and for improving the study of spatially localized airborne PM concentrations. One study conducted in the United Kingdom evaluated the performance of four models of low-cost PM sensors, and examined inter-model performance across 19 different particle sensor units. Despite differences in the way each sensor type derived PM concentrations, the researchers found general agreement in PM readings across sensor types [25]. Another study evaluated the performance of two widely-used particle sensors, the Plantower PMS A003 and Shinyei PPD42NS, and developed PM calibration models for seven different metropolitan areas (i.e., Los Angeles, Chicago, New York, Baltimore, Minneapolis-St. Paul, Winston-Salem and Seattle) using a sample of 72 sensors. The authors found that good calibration models were feasible only with the Plantower PMS A003 model after running simulations for region-specific models [5]. Another study found that a Plantower PMS 1003 sensor provided reliable PM data outputs over a 13-month period [10]. Our study extended this time period to over 2 years of reliable output from Plantower PM sensors (albeit a different model), although the reproducibility of the calibration coefficients varied by individual units over time. One of the largest programs of low-cost sensor use is currently underway with the U.S. EPA's AirNow network of low-cost PurpleAir sensors for the nationwide monitoring of wildfire-generated PM (https://www.airnow.gov/fires/using-airnow-during-wildfires/). As demonstrated in our study of Airbeam sensors, the PurpleAir sensors report PM levels that differ from more expensive and reliable monitoring instruments, but these offsets can be corrected by a 'correction equation.' The underlying design of the PurpleAir device is based on the fact that low-cost sensors may degrade over time, and therefore the PurpleAir device evaluates individual sensor degradation by continually comparing the output of two low-cost Plantower PM sensor units built into each monitoring device [26]. As such, the EPA has published guidelines on the use and performance testing of low-cost air pollution sensors (https://cfpub.epa.gov/si/si_public_record_Report.cfm?dirEntryId=350785&Lab=CEMM). Without such corrections, caution is necessary regarding the reliability of low-cost PM sensors over time.
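The dual-sensor comparison mentioned above can be illustrated with a small sketch of a channel agreement check. This is not PurpleAir's or the EPA's actual rule: the function names and the 20% tolerance are hypothetical, chosen only to show the idea of flagging a device when its two built-in sensors drift apart.

```python
def relative_difference(channel_a, channel_b):
    """Absolute difference between two readings, relative to their mean."""
    mean_reading = (channel_a + channel_b) / 2
    if mean_reading == 0:
        return 0.0  # both channels read zero, so they agree
    return abs(channel_a - channel_b) / mean_reading

def channels_disagree(channel_a, channel_b, tolerance=0.2):
    """Flag a reading pair whose relative difference exceeds the tolerance.

    The tolerance value is a hypothetical placeholder, not a published threshold.
    """
    return relative_difference(channel_a, channel_b) > tolerance

# Hypothetical simultaneous readings (µg/m³) from the two sensors in one device.
flag_close = channels_disagree(10.0, 10.5)
flag_apart = channels_disagree(10.0, 20.0)
```

Applied continually, a check of this kind lets a device report its own degradation without a reference instrument, although it cannot detect drift that affects both channels equally.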
There were several limitations to our research. Overall, the PM output of each low-cost particle sensor differed from the PM output of the widely used PDR-1500, which has airflow regulation and an infra-red laser that are far more precise than what is available in the low-cost PM sensors, suggesting a potential for under- or over-estimation of PM levels when calibration methods are not utilized. Over time, we experienced equipment failures in a significant number of sensors, particularly the Airbeam 1 generation, thus reducing our effective sample size in this calibration study. There were also a number of strengths to our research. Our study provides a robust assessment of the utility of low-cost particle sensors among a large number of units from two generations of a single brand of particle sensor available for purchase and utilized in citizen science across the U.S. [27]. We compared the robustness of these two low-cost Airbeam particle sensor types, as well as across two different calibration particle types. We restricted our analysis to sensors that did not provide evidence of malfunction over time, and measured calibration coefficients over a 2-year period, allowing for the assessment of the reliability of these sensors for air quality monitoring.

Conclusions
We observed modest changes in calibration coefficient measurements over a 2-year timeframe among both low-cost Airbeam particle sensor types, but in general the later generation Airbeam 2 model was more reliable, suggesting that specific particle sensors may yield better longitudinal consistency. While we did observe a degree of inter-monitor variability, changes in calibration coefficient measurements were relatively consistent across Airbeam 1 and 2 sensors. Finally, we did not observe a significant difference in calibration coefficients when using cigarette smoke and subway PM as the calibration PM. As noted by our results and those of other researchers, low-cost PM sensors can provide reliable and consistent air quality data, but regular calibration of the monitors is necessary to optimize their utility.

Declarations
The study protocol and procedures stemming from the R01CA220591 grant award were approved by the Institutional Review Board at the New York University School of Medicine on July 20, 2017; IRB number: S17-0968. In this current study stemming from our larger R01 grant, we did not involve human research subjects.