Estimating Hourly Full-Coverage PM2.5 Concentrations Based on MODIS Data over the Northeast of Thailand

Fine particulate matter (PM2.5) is a significant threat to human health; however, PM2.5 monitoring is very limited in developing countries. Satellite remote sensing can expand spatial coverage and potentially enhance our ability to estimate PM2.5 in a specific area, although some studies have reported poor predictive performance. An innovative combination of MODIS AOD products was developed to fill all missing aerosol optical depth (AOD) data obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS), so that hourly PM2.5 concentrations could be obtained for Northeastern Thailand. A linear mixed-effects (LME) model was used to predict location-specific hourly PM2.5 levels. Hourly PM2.5 concentrations measured at 20 monitoring sites and 10-fold cross-validation were used for model validation. The agreement between observed and predicted concentrations suggests that the LME model built on MODIS AOD data and other factors is a potentially useful predictor of hourly PM2.5 concentrations (R2 > 0.70), providing more detailed spatial information for local-scale studies. Interestingly, PM2.5 along the Mekong River was higher than in the plain area. This finding suggests that the monsoon wind brings polluted air into the provinces from sources outside the region. The results will be helpful for air pollution-related health studies.

The Northeast of Thailand is also facing this problem. This region consists of 20 provinces and is Thailand's most significant part. Many sections of this region have suffered from severe PM2.5 pollution in recent years, and both the polluted area and the pollution levels are gradually increasing. PM2.5 data are therefore required for assessing the impacts of air quality on human health. However, ground-based PM2.5 monitoring is inadequate because of budget limitations. As satellite remote sensing can provide data with extensive spatial coverage, satellite-derived aerosol optical depth (AOD) data can be used to derive surface PM2.5 concentrations. PM2.5 can be obtained from the AOD-PM2.5 association, but such studies have only been carried out in a few areas in Thailand (Kanabkaew 2013; Tsai et al. 2000). So far, very little attention has been paid to estimating PM2.5 from MODIS Terra and Aqua satellite remote sensing covering Thailand. Additionally, cloud cover leads to missing AOD data, mainly in the rainy season.
Therefore, this study focused on an innovative combination of MODIS AOD data for filling missing data. Satellite AOD data were retrieved from the MODIS Terra and Aqua Daily Level-3 products at 1°×1° resolution (MOD20 for Terra and MYD20 for Aqua), with MOD04 L2 and MOD08 M3 (https://ladsweb.modaps.eosdis.nasa.gov/search/order/1) used to fill all missing values. This is necessary because predicting hourly surface PM2.5 requires satellite-based AOD data that are sufficient and continuous. Previous research has established the quantitative association between satellite AOD and ground-based PM2.5 using the LME model (Mhawish et al.). Recently, LME has been widely used to establish the PM2.5-AOD relationship together with other meteorological factors. Therefore, the LME model was applied to satellite AOD with climate variables and other parameters to predict hourly PM2.5 concentrations with full spatial coverage in Northeast Thailand.
This study focuses on the Northeast of Thailand, located on the Khorat Plateau and bordered by the Mekong River (along the Laos-Thailand border) to the north and east, by Cambodia to the southeast, and by the Sankamphaeng Range south of Nakhon Ratchasima. It is separated from northern and central Thailand by the Phetchabun Mountains to the west. The region is naturally a high-level plain called the northeast plateau. Northeastern Thailand covers 167,718 km2 and has a population of 21,305,000.
To calibrate the MODIS AOD data, PM2.5 was sampled at 20 stations, as shown in Figure 1. The calibration was performed independently for each monitoring site. A single hourly AOD-PM2.5 relationship was then established from the 20 monitoring stations. The hourly PM2.5 concentrations predicted with this method were validated independently.
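The text does not spell out how the MODIS products were combined to fill missing AOD values. As a minimal sketch only, the example below assumes a simple priority fallback: use the daily Level-3 value where it exists, otherwise the MOD04 L2 value, otherwise the MOD08 M3 monthly value. The function name `fill_aod`, the priority order, and the sample numbers are illustrative assumptions, not the authors' procedure or data.

```python
from typing import List, Optional, Sequence


def fill_aod(*sources: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Merge co-located AOD series, keeping the first non-missing value per cell.

    Sources are passed in priority order (highest first); None marks a
    missing retrieval, for example because of cloud cover.
    """
    if not sources:
        return []
    n = len(sources[0])
    if any(len(s) != n for s in sources):
        raise ValueError("all AOD series must have the same length")
    filled = []
    for values in zip(*sources):
        # Take the first product that has a retrieval for this cell.
        filled.append(next((v for v in values if v is not None), None))
    return filled


# Illustrative values only (not data from the study).
daily_l3 = [0.42, None, None, 0.31]    # assumed primary product
l2_swath = [0.40, 0.55, None, None]    # assumed first fallback (MOD04 L2)
monthly_l3 = [0.45, 0.50, 0.48, 0.35]  # assumed last fallback (MOD08 M3)
merged = fill_aod(daily_l3, l2_swath, monthly_l3)
```

A cell stays missing only when every product lacks a retrieval, which is why a temporally coarse product such as a monthly mean is placed last in the priority order.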

AOD data
The MODIS aerosol product monitors ambient AOD globally over the oceans and the continents. Daily Level-2 data are produced at spatial resolutions of 0.25, 0.5, and 1.0 km (36 spectral channels between 0.41 and 14.23 µm) with a viewing swath of 2330 km. Aerosol products are produced over both ocean and land (Kaufman et al. 1997). MODIS AOD Collection 6.1, with a spatial resolution of 1°×1°, was used in this work and retrieved from the Giovanni website (http://disc.sci.gsfc.nasa.gov/giovanni).

Meteorological factors and land use data
Meteorological factors with horizontal resolutions of 1-30 km were retrieved from the Weather Research and Forecasting (WRF) model, a mesoscale meteorological model developed by the National Centers for Environmental Prediction (NCEP) and the National Center for Atmospheric Research (NCAR). The meteorological factors comprised relative humidity (RH), temperature (TEM), wind speed (WS), and boundary layer height (BLH). The normalized difference vegetation index (NDVI), with a spatial resolution of 1 km, was downloaded from the EOSDIS website (https://search.earthdata.nasa.gov). All data were processed at 10 km, corresponding to the spatial resolution of the MODIS AOD data.
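The resampling method used to bring the inputs to a common 10 km grid is not stated. One common option, shown here purely as an assumption, is block averaging: each coarse cell takes the mean of the fine cells it covers, ignoring missing values. The helper `block_mean` is hypothetical, and the demonstration uses a 4×4 grid with a factor of 2 for brevity; going from 1 km NDVI to 10 km would use a factor of 10.

```python
from typing import List, Optional

Grid = List[List[Optional[float]]]


def block_mean(grid: Grid, factor: int) -> Grid:
    """Aggregate a 2-D grid by averaging factor x factor blocks.

    Missing cells (None) are ignored; a block with no valid cell stays None.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r0 in range(0, rows, factor):
        coarse_row = []
        for c0 in range(0, cols, factor):
            # Collect the valid fine cells that fall inside this coarse cell.
            cells = [
                grid[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
                if grid[r][c] is not None
            ]
            coarse_row.append(sum(cells) / len(cells) if cells else None)
        coarse.append(coarse_row)
    return coarse


# Illustrative NDVI values only (not data from the study).
ndvi_fine = [
    [0.2, 0.4, 0.6, 0.6],
    [0.2, 0.4, 0.6, 0.6],
    [None, None, 0.5, 0.7],
    [None, None, 0.5, 0.7],
]
ndvi_coarse = block_mean(ndvi_fine, 2)
```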

Ground-based PM2.5 data
Hourly PM2.5 data were collected from the Pollution Control Department and the Climate Change Data Center of Chiang Mai University. To calibrate the MODIS AOD data, PM2.5 was sampled at 20 stations from 1 January 2020 through 31 December 2020. The calibration was performed independently for each station. A single hourly AOD-PM2.5 relationship was established using all parameters obtained from the 20 monitoring stations. Each province differs in underlying geology, traffic volume, human population, and elevation (ELEV). Therefore, this study added those parameters to improve the accuracy of PM2.5 estimates in each province. The LME model using MODIS AOD, meteorological factors, and NDVI can be expressed as Equation (1).
where PM2.5_sh is the ground-based PM2.5 (µg/m3) at site s in hour of the day h; β and μ_h (hour-of-the-day-specific) are the fixed and random intercepts, respectively; AOD_sh represents the MODIS and MISR AODs at site s in hour h; β1 and µ1h (hour-of-the-day-specific) are the fixed and random slopes of AOD, respectively; Tem_sh (°C), RH_sh (%), WS_sh (m/s), and BLH_sh (m) are the climate variables at site s in hour h; β2-β5 and µ2-µ5 are the corresponding fixed slopes and random day-specific slopes, respectively; NDVI_sM (unitless) is the NDVI at site s in month M; β6 and µ6M are the fixed slope and random month-specific slope, respectively; and ε is the residual error.
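Equation (1) itself did not survive in the text. A form consistent with the symbol definitions above is offered below as a reconstruction, not as the authors' exact expression. It assumes the usual LME structure of a fixed plus a random coefficient on each predictor, with s indexing the site, h the hour of the day, and M the month.

```latex
\begin{equation}
\begin{aligned}
\mathrm{PM}_{2.5,sh} ={}& (\beta + \mu_{h})
  + (\beta_{1} + \mu_{1h})\,\mathrm{AOD}_{sh}
  + (\beta_{2} + \mu_{2})\,\mathrm{Tem}_{sh}
  + (\beta_{3} + \mu_{3})\,\mathrm{RH}_{sh} \\
 & + (\beta_{4} + \mu_{4})\,\mathrm{WS}_{sh}
  + (\beta_{5} + \mu_{5})\,\mathrm{BLH}_{sh}
  + (\beta_{6} + \mu_{6M})\,\mathrm{NDVI}_{sM}
  + \varepsilon
\end{aligned}
\tag{1}
\end{equation}
```

Each bracket pairs the fixed effect shared by all observations with a random deviation, so the AOD slope, for example, is allowed to differ from one hour of the day to another.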

Model Validation
The 10-fold cross-validation (CV) method was used to test the model for overfitting. Agreement between the PM2.5 concentrations predicted in the 10-fold CV and the measured PM2.5 concentrations was evaluated using the coefficient of determination (R2), mean prediction error (MPE), and root-mean-square error (RMSE). The relative prediction error (RPE) was also applied. RPE is defined as the RMSE divided by the mean observed value in the independent dataset, expressed as a percentage:
RPE = (RMSE / mean observed PM2.5) × 100%
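As a minimal sketch of this validation step, the example below splits sample indices into 10 folds and computes R2, MPE, RMSE, and RPE from pooled observed and predicted values. Several choices are assumptions rather than details from the study: R2 is taken as the coefficient of determination, MPE as the mean absolute prediction error, and RPE as RMSE divided by the mean observed value times 100. The function names and sample numbers are illustrative.

```python
import math
import random
from typing import Dict, List, Sequence


def kfold_indices(n: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_metrics(observed: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Agreement statistics between observed and cross-validated predictions."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,             # assumed: coefficient of determination
        "mpe": sum(abs(e) for e in errors) / n,  # assumed: mean absolute error
        "rmse": rmse,
        "rpe": 100.0 * rmse / mean_obs,          # percent of the mean observed value
    }


folds = kfold_indices(25, k=10)
for held_out in folds:
    # Train on the other nine folds, predict the held-out fold, pool the predictions.
    train = [i for fold in folds if fold is not held_out for i in fold]

# Illustrative values only (not data from the study).
stats = cv_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
```

Pooling the held-out predictions from all folds before computing the statistics gives one set of metrics for the whole dataset, in which every observation was predicted by a model that never saw it.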

Validation of MODIS AOD
The correlation coefficient (R2), mean absolute error (MAE), RMSE, and relative mean bias (RMB) were used to evaluate the accuracy of the MODIS AOD. Figure 2 shows the scatterplots and statistical parameters between the MODIS AOD and AERONET AOD. The MODIS MYD04 AOD was extracted from the official MODIS product field "AOD_550_Dark_Target_Deep_Blue_Combined", which combines the Dark Target (DT) and Deep Blue (DB) retrievals. There are three AERONET AOD monitoring stations in the Northeast: Ubon Ratchathani, Nong Khai, and Mukdahan. Nong Khai and Mukdahan are new stations that began operating in 2020 and have missing data in the rainy season. Therefore, the AERONET AOD at Ubon Ratchathani was compared with the satellite data. The comparison gave low RMSE (0.65-0.67) and MAE (0.52-0.54), indicating little uncertainty in the aerosol estimates.
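For illustration, the AOD validation statistics named above can be computed from matched satellite and AERONET pairs as follows. The definition of RMB is not given in the text; this sketch assumes the ratio of the mean satellite AOD to the mean AERONET AOD, so values above 1 indicate overestimation. The function name and the sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def aod_validation(satellite: Sequence[float], aeronet: Sequence[float]) -> Tuple[float, float, float]:
    """Return (MAE, RMSE, RMB) for matched satellite and AERONET AOD pairs."""
    n = len(satellite)
    if n == 0 or n != len(aeronet):
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [s - a for s, a in zip(satellite, aeronet)]
    mae = sum(abs(d) for d in diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    # Assumed definition: ratio of means (greater than 1 means overestimation).
    rmb = (sum(satellite) / n) / (sum(aeronet) / n)
    return mae, rmse, rmb


# Illustrative values only (not data from the study).
mae, rmse, rmb = aod_validation([0.30, 0.50, 0.80], [0.25, 0.55, 0.70])
```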

Descriptive statistics
The PM2.5 concentrations in the Northeast of Thailand were measured at the 20 monitoring sites from January 2020 to December 2020. Mean PM2.5, AOD, TEM, RH, WS, BLH, and NDVI values for each site are shown in Table 1.
As shown in Table 1

PM2.5 model
In this section, the LME model was developed using AOD, TEM, RH, WS, BLH, and NDVI as parameters. TEM, RH, WS, and BLH data for each hour and NDVI data for each day of the month were obtained by linear interpolation. The fixed intercept and the slopes (β1∼β5 in Equation (1)) of each predictor in the LME model are shown in Table 2. For most variables (AOD, TEM, RH, and BLH), Table 2 shows that the effects on PM2.5 are significant at the α = 0.05 level; WS and NDVI are the exceptions. The fixed effects of the intercept and the AOD slope are also statistically significant at most stations, while the random effects of the intercept and the AOD slope vary considerably by day. This supports our hypothesis that these parameters influence the relationship between PM2.5 and AOD on a daily basis. Therefore, it is possible to perform daily calibrations using data from the multiple PM2.5 monitoring sites.
Table 3 presents the site-specific comparisons between the measured and predicted PM2.5 concentrations in the LME for all spatial sites (mean % precision = 24%, range = 15 to 30%). As can be seen, the model fit results in high R2, slopes close to 1, and intercepts close to 0, indicating good agreement between the measured and predicted concentrations. In the LME, the % precision ranged from 15% (3 µg/m3) in Surin to 30% (10 µg/m3) in Maha Sarakham, with a mean value of 24% (7 µg/m3). In terms of R2 and precision, our model achieved a considerably higher R2 (0.93) and lower precision value (15%, 3 µg/m3). Interestingly, PM2.5 along the Mekong River area was generally higher than in the plain area. Overall, the performance of the LME in predicting surface-level hourly PM2.5 concentrations was improved. These performance tests suggest that the LME can be used to produce concentration datasets that are reliable for both time-series and cross-sectional health effect studies.
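The linear interpolation used to bring the meteorological inputs to hourly values can be sketched as below. The function name, the 3-hour spacing of the known values, and the temperatures are illustrative assumptions; the source does not state the native time step of the inputs.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence


def interpolate_linear(
    known_t: Sequence[float], known_v: Sequence[float], targets: Sequence[float]
) -> List[Optional[float]]:
    """Linearly interpolate values known at ascending times onto target times.

    Targets outside the known range are returned as None (no extrapolation).
    """
    if len(known_t) < 2 or len(known_t) != len(known_v):
        raise ValueError("need at least two known points of matching length")
    if any(b <= a for a, b in zip(known_t, known_t[1:])):
        raise ValueError("known times must be strictly ascending")
    out: List[Optional[float]] = []
    for t in targets:
        if t < known_t[0] or t > known_t[-1]:
            out.append(None)
            continue
        # Index of the right-hand neighbour, clamped so the last point works.
        i = min(bisect_right(known_t, t), len(known_t) - 1)
        t0, t1 = known_t[i - 1], known_t[i]
        v0, v1 = known_v[i - 1], known_v[i]
        out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return out


# Illustrative temperatures (°C) at hours 0, 3, and 6; not data from the study.
hourly_tem = interpolate_linear([0, 3, 6], [24.0, 27.0, 33.0], range(0, 8))
```

Returning None outside the known range keeps gaps visible instead of silently extrapolating, which matters when the interpolated values feed a regression model.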

Discussion
All parameters are essential for modeling PM2.5 in Northeastern Thailand in this study. Understanding the specific associations between PM2.5 and AOD is especially important where several cities experience some of the worst air quality in the country. However, AOD coverage can vary between provinces and could contribute to the differences found in β-values and model performance. In most provinces, temperature has a significant negative relationship with PM2.5 concentrations. Our model results also show associations between PM2.5 and other covariates, including RH, WS, NDVI, and BLH. The small negative β-values found for BLH suggest a slightly negative relationship with PM2.5 concentrations, and the RH association differs significantly from one location to another. An additional negative association was found between NDVI and PM2.5. Since NDVI reflects land in use for agriculture, this parameter may account for a portion of the PM concentrations in the air. Positive associations were shown for WS and BLH. The positive association for WS may be explained by wind bringing polluted air into the province from sources outside the region, especially at higher wind speeds. BLH is often a strong predictor in PM2.5 models. Generally, higher BLH is associated with lower PM2.5 concentrations because of greater vertical mixing. However, the results here are opposite to the existing literature on the impact of BLH on air pollution levels, assuming that the air pollution is not caused by local emissions (Lou et al. 2019; Miao et al. 2019). In addition, the correlation coefficient between PM2.5 concentration and BLH is relatively low, which may contribute to the lower importance seen in the model diagnostics.
As mentioned earlier, the winter months experienced the maximum PM2.5 concentrations (Kumharn and Hanprasert 2016; Kumharn et al. 2020). This is likely due to temperature inversion and to the PM2.5 brought into the country from neighboring countries. In addition, the fuel sources during these months are extremely dry, and winds bring higher temperatures (during daytime) and lower humidity to the surrounding atmosphere. Thus, high WS provides a driving force for dust generation and entrainment. Consequently, high WS may be associated with increased PM concentrations in the winter months because of prolonged fire seasons driven by these seasonal winds. In addition, the northeast monsoon brings dry and cold air from China to most areas of Thailand in the winter months, and the monsoon winds blow towards land. As a result, PM2.5 in provinces next to the river was higher than in other areas. From these data, we can infer that the wind brings polluted air into the province from sources outside the region.
Taken together, this study used MODIS data to estimate hourly PM2.5 concentrations in Northeastern Thailand, providing more information on ground-level hourly PM2.5 and the potential health risks of elevated pollutant exposure. To do so, it used the association between satellite AOD, together with other parameters, and ground-level PM2.5 concentrations. The LME model can be generalized to dust-prone areas in Thailand. Furthermore, AOD values that were missing because of cloud cover or surface albedo issues were filled using MOD04 L2 and MOD08 M3. Our goal was to combine MODIS AOD with other factors to enhance ground-level hourly PM2.5 estimation over Northeastern Thailand.

Conclusions
Satellite AOD data have been increasingly used for PM2.5 air pollution studies. Remote sensing technologies have great potential to expand current ground-level PM2.5 monitoring networks. To date, the application of satellite data to health effect studies has been limited, primarily because of the insufficient power of AOD to predict PM2.5 and the high frequency of non-retrieval days. In this work, all missing AOD values were filled, which made it possible to determine the temporal and spatial patterns of PM2.5 in a large study domain comprising northeastern Thailand. In the future, satellite technologies will provide finer spatial and temporal resolutions and more accurate retrievals. In addition, the advancing capability of satellite technologies to differentiate aerosol types will further contribute to studies investigating health implications. Since satellite data are readily available, the PM2.5 model can produce predictions cost-effectively. Our method will help to examine the associations between subject-specific exposures to PM2.5 and health effects.