Synergistic combination of information from ground observations, geostationary satellite, and air quality modeling towards improved PM2.5 predictability

doi:10.21203/rs.3.rs-2089066/v1


This work is licensed under a CC BY 4.0 License

Journal publication: published 23 May 2023 in npj Climate and Atmospheric Science. This is the latest preprint version.

Concentrations of ambient particulate matter (such as PM_2.5 and PM₁₀) have come to represent a serious environmental problem worldwide, causing many deaths and economic losses. Because of the detrimental effects of PM_2.5 on human health, many countries and international organizations have developed and operated regional and global short-term PM_2.5 prediction systems. The short-term predictability of PM_2.5 (and PM₁₀) is determined by two main factors: the performance of the air quality model and the accuracy of the initial states. Focusing on the latter factor, this study demonstrates how information from ‘classical’ ground observation networks, a ‘state-of-the-art’ geostationary (GEO) satellite sensor, and an advanced air quality modeling system can be synergistically combined to improve short-term PM_2.5 predictability over South Korea. Such a synergistic combination can effectively overcome the major obstacle of scarcity of information, which frequently occurs in PM_2.5 prediction systems using low Earth orbit (LEO) satellite-borne observations. This study first shows that the scarcity of information is mainly associated with cloud masking, the sun-glint effect, and the ill-location of satellite-borne data. It then demonstrates that an advanced air quality modeling system equipped with synergistically combined information can achieve substantially improved performance, producing enhancements of approximately 10%, 17%, 49%, and 19% in the predictability of PM_2.5 over South Korea in terms of IOA (index of agreement), R (correlation coefficient), MB (mean bias), and HR (hit rate), respectively, compared to PM_2.5 prediction systems using only LEO satellite-derived observations.

Earth and environmental sciences/Climate sciences/Atmospheric science/Atmospheric chemistry

Earth and environmental sciences/Environmental sciences/Environmental chemistry/Environmental monitoring

Earth and environmental sciences/Environmental sciences/Environmental chemistry/Atmospheric chemistry

Ambient particulate matters

Predictability of PM2.5

Air quality forecasts

Geostationary satellite sensor

Data assimilation

Particulate matter with aerodynamic diameters smaller than 2.5 µm (PM_2.5) has come to represent a serious societal issue in South Korea and China, because its ambient concentrations frequently exceed criteria concentrations, particularly during the winter and spring seasons ^1,2. The detrimental effects of PM_2.5 on human health are well recognized. High PM_2.5 can lead to high occurrence rates of stroke, ischemic heart disease (IHD), chronic obstructive pulmonary disease (COPD), and lung cancer ^3,4. Because of the human toxicity of high PM_2.5, the South Korean Ministry of Environment (KMoE) has been issuing PM_2.5 forecasts since 2014, both to promptly alert South Korean citizens to haze events and to prepare national emergency programs for reducing the emissions of air pollutants.

Many countries have developed and operated their own short-term air quality forecasting systems (refer to Supplementary Section 1). Among those operational systems, the ECMWF (European Centre for Medium-Range Weather Forecasts) in Europe and the NASA (National Aeronautics and Space Administration) Goddard center in the USA have implemented global air quality forecasts, including PM_2.5 and ozone predictions, using the C-IFS (Integrated Forecasting System with Atmospheric Composition) ⁵ and GEOS-Chem (Goddard Earth Observing System with Chemistry) ⁶ models, respectively. Several other organizations also release indices of global and regional air quality as well as PM_2.5 on a daily basis, based on multi-model air quality simulations ⁷. Air quality forecasts are now becoming an important part of our daily life, much like traditional weather forecasts.

Although air quality forecasts provide people with important information on air quality every day, people (particularly those in South Korea) have expressed high degrees of dissatisfaction with the accuracy of PM_2.5 forecasts. There is therefore a strong need in South Korean society to improve the predictability of ambient PM_2.5. In this context, the current study addresses how to enhance the accuracy of air quality predictions. Because the focus of air pollution concerns in South Korea and northeast Asia is now on PM_2.5, we pay particular attention to PM_2.5 in this study. However, we believe that the strategies and methods used to enhance the accuracy of PM_2.5 predictions can also be applied to other ambient pollutants such as ozone and NO₂.

Throughout this study, several key data and ‘state-of-the-art’ technical elements are synergistically combined. These include: (i) aerosol data retrieved from a Korean ‘geostationary (GEO) satellite sensor’ with/without the aid of a machine learning technique; (ii) near real-time ground observations collected through a ‘screen crawling technique’; and (iii) outputs from an ‘advanced’ air quality modeling system. The objective of this study is to determine whether a ‘smart combination’ of these technical elements and data can create synergy to improve PM_2.5 predictability over South Korea. All technical elements mentioned above were integrated into the framework of K_ACheMS v2.0 (Korean Air Chemistry Modeling System version 2.0). K_ACheMS v2.0 is an air quality modeling system that is currently being developed, primarily to enhance the predictability of PM_2.5 and PM₁₀ in South Korea (for details on K_ACheMS v2.0, refer to the Materials and Methods and Supplementary Section 2). Figure 1 illustrates the domain of interest in this study.

ECMWF CAMS_nrt products over South Korea

We start our discussion with CAMS_nrt (Copernicus Atmosphere Monitoring Service, near real-time products), the global air quality forecast data produced by the C-IFS of the ECMWF (for details on the C-IFS model simulations, see Supplementary Section 3) ⁸.

Figure 2a presents daily variations of averaged PM_2.5 over South Korea during the period of the KORUS-AQ campaign (Korea–United States Air Quality campaign), carried out between 1 May, 2016 and 10 June, 2016. Figure 2 also shows comparisons between PM_2.5 predictions and PM_2.5 observations. Because PM_2.5 forecasts are usually made on a daily basis, we present the daily variation in Fig. 2; hourly comparisons are provided in Supplementary Fig. 1. PM_2.5 observations were acquired from a ground observation network in South Korea called ‘AIR KOREA’, which is managed by the KMoE. The KORUS-AQ campaign has been well investigated in terms of its meteorological and physico-chemical characteristics ^9,10. Therefore, we selected this campaign period as the time window for this study.

The CAMS_nrt PM_2.5 in Fig. 2a shows a moderate agreement with observed PM_2.5, with an IOA (Index of Agreement) of 0.51. Here, the IOA was calculated from hourly PM_2.5 data, not from daily values. Only the IOA is presented in Fig. 2, but other statistical metrics, including errors and biases, were also analyzed (refer to Supplementary Section 4). The moderate IOA of 0.51 in Fig. 2a might be affected by two main factors: (i) modeling errors produced by C-IFS model simulations and (ii) inaccuracy in the initial state caused by data assimilation with LEO (Low Earth Orbit) satellite-derived data. First, to evaluate the accuracy of C-IFS model simulations, we compared the predicted concentrations of major PM_2.5 constituents with observed concentrations of sulfate, organic aerosols (OAs), black carbon (BC), and dust at six intensive measurement stations in South Korea. The results of the comparisons are shown in Supplementary Section 5. In general, C-IFS model simulations over-predicted dust and BC concentrations but under-predicted OA concentrations. These inaccuracies contributed to the moderate IOA shown in Fig. 2a.
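The source does not state which IOA variant was used. As a minimal sketch, assuming Willmott's original index of agreement, the metric can be computed as follows. The function name `index_of_agreement` and the sample values are illustrative only and do not come from the study.

```python
import numpy as np

def index_of_agreement(pred, obs):
    """Willmott's index of agreement d (0 = no agreement, 1 = perfect match)."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    obs_mean = obs.mean()
    # Sum of squared prediction errors.
    num = np.sum((pred - obs) ** 2)
    # "Potential error": spread of predictions and observations around the observed mean.
    den = np.sum((np.abs(pred - obs_mean) + np.abs(obs - obs_mean)) ** 2)
    return 1.0 - num / den

# Made-up hourly PM2.5 values (ug/m3), purely for illustration.
obs = np.array([20.0, 35.0, 50.0, 30.0])
pred = np.array([25.0, 30.0, 40.0, 28.0])
ioa = index_of_agreement(pred, obs)
```

Because the denominator grows with the spread of both series around the observed mean, the index rewards predictions that follow the observed variability and not only its average level.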

Second, initial states prepared by data assimilation are also crucial, particularly for short-term (i.e., one- or two-day) predictions. The CAMS_nrt system employs a 4-D VAR (4-Dimensional Variational) data assimilation method with MODIS (Moderate Resolution Imaging Spectroradiometer) AOD (Aerosol Optical Depth) ¹¹. The 4-D VAR method corrects errors in model fields (background fields) using ‘observations (near true value)’. The corrected fields (analysis fields) created by data assimilation are then used as initial conditions for operational C-IFS runs. The observational data for the 4-D VAR assimilation are AOD data retrieved from the two MODIS sensors onboard the Terra and Aqua satellites.
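For orientation, the textbook strong-constraint form of the 4-D VAR problem is shown below. This is the generic formulation and not necessarily the exact configuration used in CAMS_nrt, and the symbols are conventional notation that does not come from the source.

```latex
J(\mathbf{x}_0) =
  \tfrac{1}{2}\,(\mathbf{x}_0 - \mathbf{x}_b)^{\mathrm{T}} \mathbf{B}^{-1} (\mathbf{x}_0 - \mathbf{x}_b)
  + \tfrac{1}{2} \sum_{i=0}^{N}
    \bigl(\mathbf{y}_i - H_i(M_{0 \to i}(\mathbf{x}_0))\bigr)^{\mathrm{T}}
    \mathbf{R}_i^{-1}
    \bigl(\mathbf{y}_i - H_i(M_{0 \to i}(\mathbf{x}_0))\bigr)
```

Here x_b is the background state, B the background error covariance, y_i the observations (for example AOD) at time i, R_i their error covariance, H_i the observation operator, and M_{0→i} the model integration from the initial time to time i. The analysis is the x_0 that minimizes J. The 3-D VAR method is the special case with a single observation time and no model integration inside the cost function.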

However, MODIS sensors (and other LEO UV/VIS sensors) have two serious limitations in terms of data availability. First, they cannot produce AOD data over areas where clouds are present. We call this phenomenon ‘cloud masking’. (North)east Asia, for example, is highly cloudy compared to other continents, mainly because of both an abundance of cloud seeds (atmospheric aerosols) and high humidity. Such extensive cloud masking leads to a large loss of aerosol data over (north)east Asia. In fact, the average percentage of MODIS AOD data available during the period of the KORUS-AQ campaign was only ~ 14% over the domain shown in Fig. 1.
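A data-availability percentage of this kind can be sketched as the share of pixels that carry a valid retrieval. The example below assumes that missing retrievals are stored as NaN, which is a common convention but is not stated in the source. The function name and the toy scene are hypothetical.

```python
import numpy as np

def availability_percent(aod):
    """Percentage of pixels with a valid (non-NaN) AOD retrieval."""
    aod = np.asarray(aod, dtype=float)
    return 100.0 * np.count_nonzero(~np.isnan(aod)) / aod.size

# Toy 2 x 4 scene: NaN marks pixels lost to cloud masking (made-up values).
scene = np.array([[0.3, np.nan, np.nan, np.nan],
                  [np.nan, 0.5, np.nan, np.nan]])
coverage = availability_percent(scene)  # 2 valid pixels out of 8
```

Averaging this quantity over all scenes in a campaign period would give a single availability figure for the domain.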

Second, retrievals from these two MODIS sensors are prone to the ‘sun-glint effect’. This is an inherent effect caused by the geometry between LEO satellite sensors and the sun. Because of the sun-glint effect, potentially important data are sometimes lost, particularly over ocean areas, where no surface PM_2.5 observations are available ¹². This loss of MODIS AOD data due to the sun-glint effect is discussed further in Supplementary Section 6.

Collectively, these two limitations in monitoring data from MODIS sensors result in a scarcity of AOD (or observed PM_2.5) data. This scarcity of observed data can prevent data assimilation from effectively correcting model background errors. The moderate IOA of 0.51 shown in Fig. 2a is due to both modeling errors and the scarcity of the observations used in the data assimilation. In addition, Fig. 2a provides a typical example of how difficult accurate short-term PM_2.5 prediction is with our current levels of knowledge and techniques.

LEO vs. GEO Satellite Sensors

With the lessons from CAMS_nrt, we set up a strategy to build a more accurate short-term PM_2.5 prediction system over South Korea by developing a more advanced air quality modeling system intended to reduce modeling errors and by utilizing better satellite-borne PM_2.5 data to improve initial states ^13,14. Regarding the former, we have developed K_ACheMS v2.0. Regarding the latter, we decided to use AOD products from the ‘state-of-the-art’ Korean geostationary satellite sensor named GOCI (Geostationary Ocean Color Imager) instead of the two MODIS sensors. Although the GOCI sensor also cannot produce AOD data over pixels where clouds are present, it can overcome the limitation of the sun-glint effect because it is a GEO satellite sensor.

Figure 2b was produced from K_ACheMS v2.0 without performing data assimilation (i.e., without updated initial conditions). With this effort alone, the IOA rose from 0.51 (Fig. 2a) to 0.71 (Fig. 2b), indicating the potential importance of the performance of the air quality modeling system. Figure 2c presents PM_2.5 predictions made by K_ACheMS v2.0 with 3-D VAR (3-Dimensional Variational) data assimilation using only MODIS AOD. The experiment in Fig. 2c was designed to mimic CAMS_nrt. Surprisingly, the IOA in Fig. 2c was the same as that in Fig. 2b. This was because the MODIS AOD data were too sparse, due to both cloud masking and sun-glint effects, to affect the predictability of PM_2.5 in South Korea. Again, the data coverage of the MODIS sensors was only ~ 14%. With such low data coverage, the data assimilation appears to be almost ‘useless’.

This problem of data scarcity can be avoided in part by using GOCI satellite-retrieved AOD. When we used GOCI-retrieved AOD data, the percentage of AOD data available during the period of the KORUS-AQ campaign over the monitoring domain of the GOCI increased to 28.5%. Here, we used two products from the Korean GOCI sensor. First, we directly assimilated the GOCI AOD. Second, we assimilated surface PM_2.5 converted from the GOCI AOD via a machine learning technique named the ‘Random Forest (RF) method’ (for details on this data conversion, refer to Supplementary Section 7) ¹⁵. Figure 2d, e present the PM_2.5 predictability results from these two experiments. Enhancements in PM_2.5 predictability can be seen visually in Fig. 2 as well as in terms of IOA. The IOA again increased from 0.71 to 0.73–0.75, showing that assimilation of GOCI AOD data could correct model errors in the initial state more effectively than MODIS AOD data.

Why did RF-converted PM_2.5 correct modeling errors more effectively than GOCI AOD? One reason might be that the RSDs (Relative Standard Deviations) of the errors of RF-converted PM_2.5 tended to be smaller than those of GOCI AODs (discussed in detail in Supplementary Section 7). These smaller RSDs entered the observation error covariance matrix used in the data assimilation, so that background fields were corrected more strongly toward the RF-converted PM_2.5 observations. Based on these results, we conclude that applying a machine learning technique to PM_2.5 predictions leaves room to further enhance PM_2.5 predictability.
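The effect of a smaller observation error can be illustrated with the simplest possible case: one background value and one observation of the same quantity. This is a generic minimum-variance sketch, not the authors' 3-D VAR code, and all numbers and names are made up for illustration.

```python
def scalar_analysis(x_b, y, var_b, var_o):
    """Minimum-variance blend of one background value and one observation."""
    gain = var_b / (var_b + var_o)   # weight given to the observation
    return x_b + gain * (y - x_b), gain

# Made-up PM2.5 background and observation (ug/m3) and background error variance.
x_b, y, var_b = 40.0, 60.0, 100.0

# Same observation value, two assumed observation error variances.
x_loose, k_loose = scalar_analysis(x_b, y, var_b, var_o=300.0)  # noisier observation
x_tight, k_tight = scalar_analysis(x_b, y, var_b, var_o=100.0)  # more precise observation
```

With the larger error variance the analysis moves a quarter of the way toward the observation (45 µg/m³). With the smaller one it moves halfway (50 µg/m³). This is the sense in which smaller observation errors lead to stronger corrections of the background field.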

Ill-location Of Information: Satellite Data Vs. Ground Observations

Despite the substantial advancements that have been made, both MODIS- and GOCI-retrieved data have inherent and unavoidable disadvantages: ‘data scarcity’ and ‘data ill-location’. As mentioned above, the average percentage of GOCI AOD data available over the entire GOCI domain was approximately 28.5%. This means that we could not obtain AOD data from 71.5% of GOCI pixels, mainly due to the presence of clouds, which is the major reason for the scarcity of satellite-borne aerosol data. In addition, even the 28.5% of GOCI aerosol data that are available are not always located where they are useful for PM_2.5 predictions in South Korea. For example, if GOCI AOD data are available only over areas from which air masses cannot influence air quality in South Korea (e.g., over the northern edges of the modeling domain or over the East China Sea, which is far from the Korean peninsula), then such AOD data contribute nothing to improving PM_2.5 predictability in South Korea. We refer to this type of problem as the ‘ill-location problem’ of satellite data.

To overcome the problems of data scarcity and data ill-location, we must return to ‘classical’ observation data, i.e., ground observations. Because ground observations are made at fixed ‘surface’ locations, these data are never affected by the presence of clouds. Figure 1 shows PM_2.5 measurement networks in China (at approximately 1800 locations) and in South Korea (at approximately 400 sites) ^16,17. The challenge, however, is whether we can obtain these observation data in near real-time. ‘Past’ observations can be downloaded from data archives, but it is difficult to collect ‘present’ data in a near real-time mode. To resolve this problem, we developed a method called the ‘screen crawling technique’. Using this software technique, we can scan and obtain observation data from the corresponding Chinese and Korean websites in near real-time. Such near real-time observations can then be utilized almost immediately in the PM_2.5 prediction system after a quick data quality inspection called the ‘buddy test’, which was described in detail in Lee et al. ¹⁸.
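The details of the buddy test are given in Lee et al. and are not reproduced here. As a generic illustration of the idea only, a buddy check compares each station with its neighbours and rejects values that disagree strongly with them. The radius, threshold, minimum neighbour count, and function name below are all hypothetical, and distances are computed in plain degrees for simplicity.

```python
import numpy as np

def buddy_check(lon, lat, values, radius_deg=0.5, max_dev=50.0, min_buddies=2):
    """Return a boolean mask that is True where an observation passes the check.

    An observation fails when it differs from the median of its neighbours
    (stations within radius_deg) by more than max_dev. Stations with fewer
    than min_buddies neighbours are kept, because they cannot be judged.
    """
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    values = np.asarray(values, dtype=float)
    index = np.arange(values.size)
    keep = np.ones(values.size, dtype=bool)
    for i in index:
        # Flat-earth distance in degrees: a simplification for this sketch.
        dist = np.hypot(lon - lon[i], lat - lat[i])
        buddies = (dist <= radius_deg) & (index != i)
        if np.count_nonzero(buddies) < min_buddies:
            continue
        if abs(values[i] - np.median(values[buddies])) > max_dev:
            keep[i] = False
    return keep

# Four nearby stations; the third reports an implausible spike (made-up data).
lon = [126.9, 127.0, 127.1, 127.0]
lat = [37.5, 37.5, 37.5, 37.6]
pm25 = [30.0, 32.0, 400.0, 28.0]
mask = buddy_check(lon, lat, pm25)
```

In this toy case only the third station is rejected. Using the median of the neighbours keeps a single outlier from contaminating the reference value used to judge the other stations.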

Figure 2f shows the large improvements in PM_2.5 predictions over South Korea that can be made by assimilating ground PM_2.5 observations instead of GOCI AODs or RF-converted PM_2.5. The IOA increased from 0.73–0.75 to 0.77. The assimilation of these ground observations also brought additional advantages. In Fig. 2, the gray-shadow period I was characterized as an ‘Asian dust episode’ that took place from 6 May to 7 May, 2016. During this period, dust plumes were generated in Inner Mongolia and were then transported over long distances across northeastern China and the Yellow Sea. Unfortunately, these dust plumes were not detected by the GOCI sensor because such dust plumes are frequently co-present with cold frontal clouds. However, the ground observation network inside China did detect these dust plumes. During dust episodes, PM₁₀ tends to increase, because dust particles are predominantly composed of coarse-mode particles (i.e., particles larger than 2.5 µm). In this study, we assimilated both ground PM_2.5 and PM₁₀ (refer to Supplementary Section 8). Supplementary Fig. 2f clearly shows that our prediction can also capture the dust peak in PM₁₀ during period I. By contrast, the PM₁₀ predictions in Supplementary Fig. 2d, e, wherein only GOCI products were used, could not capture the dust peak. This again demonstrates why we should assimilate ‘ground observations’ of PM_2.5 and PM₁₀ to enhance PM_2.5 and PM₁₀ predictability.

What if Ground And Satellite-derived Observations Are Used Together?

What can happen if ground PM_2.5 observations and GOCI-borne data are used together for data assimilation? Can we expect a synergism? To answer this question, we designed two more experiments: (i) sequential assimilation with ground PM_2.5 and GOCI AOD, and (ii) sequential assimilation with ground PM_2.5 and RF-converted PM_2.5. Figure 2g, h show the results from these two case studies, respectively. In both experiments, IOA increased further from 0.77 to 0.78 over the entire period of the KORUS-AQ campaign, implying that the addition of either GOCI-derived AOD or RF-converted PM_2.5 into ground PM_2.5 could improve the accuracy of PM_2.5 predictions in South Korea.
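The idea of sequential assimilation can be sketched with a linear analysis step applied twice, where the analysis from the first data source becomes the background for the second. This toy example uses the closed-form solution that minimizes the 3-D VAR cost for a linear observation operator and also updates the error covariance between steps. Whether K_ACheMS v2.0 does the latter is not stated in the source, so treat it as an assumption. All numbers, names, and the three-cell grid are invented for illustration.

```python
import numpy as np

def assimilate(x_b, B, y, H, R):
    """One linear analysis step: returns the analysis state and its error covariance."""
    S = H @ B @ H.T + R                  # innovation covariance
    K = B @ H.T @ np.linalg.inv(S)       # gain matrix
    x_a = x_b + K @ (y - H @ x_b)        # corrected (analysis) state
    A = (np.eye(len(x_b)) - K @ H) @ B   # analysis error covariance
    return x_a, A

# Toy state: PM2.5 at 3 grid cells (0 = upwind sea, 1 = coast, 2 = inland).
x_b = np.array([30.0, 30.0, 30.0])
B = 100.0 * np.array([[1.0, 0.5, 0.25],
                      [0.5, 1.0, 0.5],
                      [0.25, 0.5, 1.0]])

# Step 1: a ground station observes the inland cell.
H_ground = np.array([[0.0, 0.0, 1.0]])
x_1, A_1 = assimilate(x_b, B, np.array([50.0]), H_ground, np.array([[25.0]]))

# Step 2: a satellite-derived value over the sea cell, with step 1 as the new background.
H_sat = np.array([[1.0, 0.0, 0.0]])
x_2, A_2 = assimilate(x_1, A_1, np.array([60.0]), H_sat, np.array([[100.0]]))
```

After step 1 the sea cell is only weakly corrected, because it is far from the station. The satellite value in step 2 then adjusts that upwind cell directly, which mirrors how satellite data over areas without ground stations can complement the ground network.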

This appears to be particularly true for the two gray-shadow periods II and III in Fig. 2. These periods were characterized by KORUS-AQ scientists as ‘a period of air stagnation’ and ‘an episode of long-range transport’ from China, respectively. During the period of air stagnation (between 18 May and 23 May, 2016), an anticyclone sat over the Korean peninsula. Because of this, air masses rotated clockwise with low wind speeds around the Korean peninsula (see Fig. 3b). This low-speed rotation of air masses created favorable conditions for air pollutants to accumulate around the Korean peninsula. During the period of long-range transport (between 25 May and 30 May, 2016), on the other hand, air pollutants including PM_2.5 were transported over long distances from the North China Plain (NCP) to South Korea by strong westerlies (see Fig. 3c). PM_2.5 increased up to 60 µg/m³ on 26 May, 2016. As can be seen visually in Fig. 2g, h, the gaps between PM_2.5 observations and predictions during these two gray periods became narrower than those in Fig. 2f. A similar situation was found for the predictions of PM₁₀, as shown in Supplementary Fig. 2g, h. These results are particularly important, because air stagnation and long-range transport are the two typical meteorological conditions under which the levels of PM_2.5 over the Korean peninsula are frequently enhanced. This raises the following question: what factor creates such improvements in PM_2.5 and PM₁₀ predictability?

The answer to this question is presented in Fig. 3, which demonstrates the synergism created by sequential data assimilation using ground PM_2.5 observations and GOCI-derived AOD. During periods II and III, the sequential application of both forms of data to data assimilation raised the IOA from 0.57 to 0.64 during the period of air stagnation and from 0.62 to 0.67 during the period of long-range transport. In the case of air stagnation, GOCI AOD data were available over the Yellow Sea (denoted as I in Fig. 1), North Korea (denoted as II in Fig. 1), and the East Sea (denoted as III in Fig. 1). This data availability was due to the fact that the sky was very clear (cloud-free) as a result of the anticyclone located over the Korean peninsula. Because of the rotation of air masses, all three regions became upwind regions of South Korea in this case. Therefore, it is crucial to have information over all three of these regions for PM_2.5 predictions in South Korea.

The episode of long-range transport (period III) is another excellent example regarding the creation of synergism. Surprisingly, during this period, ground PM_2.5 measurements inside China were all turned off. However, high AOD plumes were detected by the GOCI sensor over the Yellow Sea. The Yellow Sea was an upwind region to South Korea in this case (see the arrows of winds in Fig. 3c). In addition to the ground PM_2.5 information available inside South Korea, the GOCI AOD data available over the Yellow Sea helped us further enhance the accuracy of PM_2.5 predictions in South Korea.

Over the entire period of the KORUS-AQ campaign, including the two episodes discussed above, our advanced PM_2.5 prediction system (Fig. 2h) exhibited enhancements of approximately 10%, 17%, 49%, and 19% in the predictability of PM_2.5 over South Korea, in terms of IOA, R (Pearson correlation coefficient), MB (mean bias), and HR (hit rate), respectively, compared to the PM_2.5 prediction system using LEO-retrieved observations alone (Fig. 2c) (for details, also refer to Supplementary Fig. 3 and Section 4).
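The IOA figure can be checked against the values quoted earlier, assuming that the enhancement is defined as the relative change between the LEO-only experiment (Fig. 2c, IOA = 0.71) and the combined experiment (Fig. 2h, IOA = 0.78). This definition is an assumption, since the source does not spell it out.

```latex
\frac{\mathrm{IOA}_{\text{Fig. 2h}} - \mathrm{IOA}_{\text{Fig. 2c}}}{\mathrm{IOA}_{\text{Fig. 2c}}}
  = \frac{0.78 - 0.71}{0.71} \approx 0.099 \approx 10\%
```

The corresponding values of R, MB, and HR for the two experiments are not quoted in the text, so the other three percentages cannot be verified in the same way here.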

Blank Area Of Information

As described above, South Korea is surrounded by so-called ‘blank areas of information’ such as the Yellow Sea, North Korea, and the East Sea. However, transboundary air pollution events from China to South Korea almost always occur through these blank areas, carried by the strong, persistent westerly and/or northwesterly winds. This means that, for improved PM_2.5 (and PM₁₀) predictability in South Korea, it is necessary to have information on air quality over these blank areas.

In this context, we demonstrated that information from GEO satellite sensors over those blank areas of information could help us substantially improve PM_2.5 predictability in South Korea. Although geostationary satellite data can help improve PM predictability, there is no guarantee that GEO satellite data are always available over the blank areas, because of the random presence of clouds. To secure more reliable data availability over the blank areas, it would be helpful to establish surface observation networks there to boost the performance of PM_2.5 predictions in South Korea. The establishment of a surface observation network may be a cheaper and more dependable way to improve PM_2.5 predictability in South Korea than launching ‘expensive’ GEO satellite sensors.

Based on the discussion above, establishing several air quality monitoring towers or stations over the Yellow Sea and the East Sea (Sea of Japan) as well as inside North Korea would be highly useful for achieving better PM_2.5 predictions in South Korea. Of course, building air monitoring stations inside the territory of North Korea would raise political issues. It appears to be a difficult task to achieve politically, although it is an easy task to accomplish technically. In this respect, diplomatic or political efforts could lead to improved air quality predictions and management in South Korea.

Additional Discussions

In this study, we reported our major findings inferred from experiments during the period of the KORUS-AQ campaign. To more firmly support our conclusions, we also carried out another six-month test-bed experiment between 1 November, 2016 and 30 April, 2017 (a high PM_2.5 period in South Korea). Test results from this six-month investigation are discussed in Supplementary Section 9. In short, we again drew the same conclusions from the six-month experiment. Based on this finding, we now believe that our conclusions drawn from this study are general ones, not period-specific. The methods and strategies applied to South Korea can also be applied to many other regions that encounter similar situations. In this sense, our methods and strategies are not area-specific, either.

In addition, IOAs and other statistical metrics such as root mean square errors and mean biases are commonly used in the science community. We refer to these metrics as ‘scientific statistical metrics’. On the other hand, the HR is a metric to which administrative bodies such as the KMoE pay particular attention in South Korea. Therefore, the HR is referred to as an ‘administrative statistical metric’. High IOAs do not always guarantee high HRs because, unlike the IOA, the HR is the percentage of predictions that fall within the correct PM_2.5 category interval defined in the South Korean air quality standards. For details on HR, refer to Supplementary Section 10. Although the HR is not a scientific metric, there has also been a strong requirement to enhance the HR in South Korea. This is why we also included HRs in Fig. 2. It can be seen that the HRs increased continuously from Fig. 2b to Fig. 2h.
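The difference between error-based metrics and the HR can be sketched as follows. The exact HR definition is given in Supplementary Section 10 and is not reproduced here. The category bounds below (15, 35, and 75 µg/m³) are an assumption and should be checked against the standards in force for the period of interest. All data and function names are illustrative.

```python
import numpy as np

# Assumed PM2.5 category upper bounds (ug/m3):
# good <= 15 < moderate <= 35 < bad <= 75 < very bad.
CATEGORY_EDGES = [15.0, 35.0, 75.0]

def mean_bias(pred, obs):
    """Mean of (prediction - observation)."""
    return float(np.mean(np.asarray(pred, dtype=float) - np.asarray(obs, dtype=float)))

def pearson_r(pred, obs):
    """Pearson correlation coefficient between predictions and observations."""
    return float(np.corrcoef(pred, obs)[0, 1])

def hit_rate(pred, obs, edges=CATEGORY_EDGES):
    """Percentage of cases where prediction and observation share a category."""
    pred_cat = np.digitize(pred, edges, right=True)
    obs_cat = np.digitize(obs, edges, right=True)
    return 100.0 * float(np.mean(pred_cat == obs_cat))

obs = np.array([10.0, 20.0, 40.0, 80.0])
pred_a = np.array([14.0, 34.0, 36.0, 76.0])  # larger errors, but every category is hit
pred_b = np.array([16.0, 20.0, 34.0, 74.0])  # smaller errors, three of four categories missed
```

In this toy case `pred_b` has the smaller squared errors but a hit rate of 25%, while `pred_a` reaches 100%. This shows how a prediction that is numerically closer to the observations can still fall on the wrong side of a category boundary.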

The K_ACheMS v2.0 discussed in this study has been run in a test mode for PM_2.5 predictions over South Korea since 1 Jan., 2022 (visit the website at https://kachems.gist.ac.kr). Figure 4 presents part of the performance of K_ACheMS v2.0 through comparisons between PM_2.5 observations and PM_2.5 predictions. In Fig. 4, three PM_2.5 predictions, from K_ACheMS v2.0 and from the current operational PM_2.5 prediction systems at NIER (National Institute of Environmental Research in South Korea) and ECMWF, are compared with PM_2.5 observations obtained from AIR KOREA. The PM_2.5 predictions from the NIER are based on a different approach: an ensemble average of 20 outputs from 20 combinations of different meteorology models, air quality models, and emission inventories ¹⁹. It can be seen in Fig. 4 that K_ACheMS v2.0 produced much better PM_2.5 predictions than the other two systems at NIER and ECMWF for three major high PM_2.5 episodes. The three high PM_2.5 episodes shown in Fig. 4 were the ‘biggest’ ones that occurred over the six months during which K_ACheMS v2.0 was operated in a test mode (between 1 Jan. and 30 June, 2022).

During the six-month test period, the IOA and HR from K_ACheMS v2.0 were 0.78 and 67%, respectively. These numbers were higher than those produced by the ECMWF prediction system (IOA: 0.76 and HR: 54%). Although the PM_2.5 predictions from the NIER system have not yet been released, the annual averaged HR of the NIER system has been reported to be approximately 55%. Further analyses are presented in Supplementary Section 11 (refer to Supplementary Tables 7 and 8). Based on these analyses, the new PM_2.5 prediction system proposed in this study appears to have worked very well over the first six-month test period, although a comprehensive performance report on the one-year-long test, including the PM_2.5 predictions from the NIER system, is scheduled to be issued at the end of 2022.

While developing K_ACheMS v2.0, we also made parallel efforts to develop machine learning techniques to correct errors and biases resulting from air quality model simulations. We call this process ‘error and bias correction via machine learning’. For this process, we selected the machine learning technique ‘XGBoost (eXtreme Gradient Boosting)’ ²⁰. We find that this technique has been particularly effective in enhancing ‘HRs’, without showing great advantages in improving IOAs. This is discussed further in Supplementary Section 12.

K_ACheMS v2.0 consists of the following: (i) a chemistry-transport model (CMAQ vG, a CMAQ model significantly modified by the Gwangju Institute of Science and Technology in South Korea); (ii) meteorological models including the WRF-ARW (Advanced Research version of the Weather Research and Forecasting) model and the UK Met Office UM (Unified Model), together with WRF-MCIP (Meteorology-Chemistry Interface Processor) and UM-MCIP; (iii) bottom-up emissions based on KORUS v5.0; (iv) a data assimilation system based on 3-D VAR and EnKF (Ensemble Kalman Filter) methods using ground observations and/or geostationary GOCI data; and (v) machine learning elements such as XGBoost. Each element of K_ACheMS v2.0 is described further in Supplementary Section 2.

The CMAQ vG (Community Multi-scale Air Quality version G) model has been developed based on the US EPA CMAQ v5.2 model. In particular, the model has been improved through the following additions: (i) daytime HONO photochemistry ²¹, (ii) heterogeneous HO₂ reactions ²², (iii) gas- and aqueous-phase halogen chemistry ²³, (iv) new yield data for the formation of secondary organic aerosols acquired from multiple smog chamber experiments conducted under typical conditions of northeast Asia ²⁴, and (v) new convection (Kain-Fritsch) and advection (Lagrangian Trajectory-Grid) schemes ^25,26.

For anthropogenic emissions, KORUS v5.0 was used. KORUS v5.0 is a bottom-up emission inventory that was specifically established for the KORUS-AQ campaign ²⁷. K_ACheMS v2.0 can use two meteorological fields, generated by the WRF-ARW model ²⁸ and the UM ²⁹. This study used the meteorological fields generated by the WRF-ARW model, with 0.25°×0.25°-resolved NOAA GFS meteorological fields for initial and boundary conditions.

Two data assimilation techniques have been developed and incorporated into K_ACheMS v2.0: the 3-D VAR and EnKF methods. In this study, the 3-D VAR technique was used with ground observations and/or satellite-derived information over northeast Asia. To use the 3-D VAR method, we developed background and observation error covariance matrices. To update the initial conditions, data assimilation was carried out at 00 UTC every day with satellite-borne and/or ground observations. Further technical details regarding the 3-D VAR method can be found in Supplementary Section 13.

Data Availability

The data are available upon request to the corresponding author.

Acknowledgments

This research was supported by the National Research Foundation of Korea (NRF) grant funded by the Ministry of Science and ICT (MSIT) (No. 2020M3G1A1114617 and No. 2021R1A2C1006660).

Author Contributions

J.Y., D.L., K.M.H., and S.-Y.P. carried out K_ACheMS model simulations and data analysis; S.L., P.E.S., and G.R.C. established the data assimilation technique; H.S.K., M.J., S.P., J.I., and J.K. developed machine learning techniques; V.-H.P. provided ECMWF CAMS data; J.K. and C.-K.S. developed the GOCI AOD retrieval algorithm; J.-H.W. prepared emissions; S.-H.R. developed the screen crawling technique; C.H.S. designed and supervised the research and wrote the paper.

Competing Interests

The authors declare no competing interests.

References

1. Kim, H. C. et al. Recent increase of surface particulate matter concentrations in the Seoul Metropolitan Area, Korea. Sci Rep 7, 4710 (2017).
2. Huang, R.-J. et al. High secondary aerosol contribution to particulate pollution during haze events in China. Nature 514, 218–222 (2014).
3. Apte, J. S., Marshall, J. D., Cohen, A. J. & Brauer, M. Addressing Global Mortality from Ambient PM_2.5. Environ. Sci. Technol. 49, 8057–8066 (2015).
4. Burnett, R. T. et al. An Integrated Risk Function for Estimating the Global Burden of Disease Attributable to Ambient Fine Particulate Matter Exposure. Environmental Health Perspectives 122, 397–403 (2014).
5. Flemming, J. et al. Tropospheric chemistry in the Integrated Forecasting System of ECMWF. Geoscientific Model Development 8, 975–1003 (2015).
6. Keller, C. A. et al. Description of the NASA GEOS Composition Forecast Modeling System GEOS-CF v1.0. Journal of Advances in Modeling Earth Systems 13, e2020MS002413 (2021).
7. For example, global and regional multi-model air quality forecasts can be found at http://waqi.info/forecast/#/ (August 1, 2022).
8. https://apps.ecmwf.int/datasets/data/cams-nrealtime (August 1, 2022).
9. Peterson, D. A. et al. Meteorology influencing springtime air quality, pollution transport, and visibility in Korea. Elementa: Science of the Anthropocene 7, (2019).
10. Kim, H., Zhang, Q. & Heo, J. Influence of intense secondary aerosol formation and long-range transport on aerosol chemistry and properties in the Seoul Metropolitan Area during spring time: results from KORUS-AQ. Atmospheric Chemistry and Physics 18, 7149–7168 (2018).
11. Benedetti, A. et al. Aerosol analysis and forecast in the European Centre for Medium-Range Weather Forecasts Integrated Forecast System: 2. Data assimilation. Journal of Geophysical Research: Atmospheres 114, (2009).
12. Cox, C. & Munk, W. Measurement of the Roughness of the Sea Surface from Photographs of the Sun’s Glitter. J. Opt. Soc. Am. 44, 838–850 (1954).
13. Lee, S. et al. GIST-PM-Asia v1: development of a numerical system to improve particulate matter forecasts in South Korea using geostationary satellite-retrieved aerosol optical data over Northeast Asia. Geoscientific Model Development 9, 17–39 (2016).
14. Lee, K. et al. Development of Korean Air Quality Prediction System version 1 (KAQPS v1) with focuses on practical issues. Geoscientific Model Development 13, 1055–1073 (2020).
15. Park, S. et al. Estimation of ground-level particulate matter concentrations through the synergistic use of satellite observations and process-based models over South Korea. Atmospheric Chemistry and Physics 19, 1097–1113 (2019).
16. China urban air quality real-time data release platform, http://www.cnemc.cn/en (August 1, 2022).
17. AIR KOREA, https://www.airkorea.or.kr (August 1, 2022).
18. Lee, S. et al. Impacts of uncertainties in emissions on aerosol data assimilation and short-term PM_2.5 predictions over Northeast Asia. Atmospheric Environment 271, 118921 (2022).
19. Chang, L.-S. et al. Human-model hybrid Korean air quality forecasting system. J Air Waste Manag Assoc 66, 896–911 (2016).
20. Ivatt, P. D. & Evans, M. J. Improving the prediction of an atmospheric chemistry transport model using gradient-boosted regression trees. Atmospheric Chemistry and Physics 20, 8063–8082 (2020).
21. Zhang, L. et al. Potential sources of nitrous acid (HONO) and their impacts on ozone: A WRF-Chem study in a polluted subtropical region. Journal of Geophysical Research: Atmospheres 121, 3645–3662 (2016).
22. Macintyre, H. L. & Evans, M. J. Parameterisation and impact of aerosol uptake of HO₂ on a global tropospheric model. Atmospheric Chemistry and Physics 11, 10965–10974 (2011).
23. Sarwar, G. et al. Impact of Enhanced Ozone Deposition and Halogen Chemistry on Tropospheric Ozone over the Northern Hemisphere. Environ. Sci. Technol. 49, 9203–9211 (2015).
24. Babar, Z. B., Park, J.-H., Kang, J. & Lim, H. Characterization of a Smog Chamber for Studying Formation and Physicochemical Properties of Secondary Organic Aerosol. Aerosol Air Qual. Res. 16, 3102–3113 (2016).
25. Pouyaei, A. et al. Development and Implementation of a Physics-Based Convective Mixing Scheme in the Community Multiscale Air Quality Modeling Framework. Journal of Advances in Modeling Earth Systems 13, e2021MS002475 (2021).
26. Pouyaei, A., Choi, Y., Jung, J., Sadeghi, B. & Song, C. H. Concentration Trajectory Route of Air pollution with an Integrated Lagrangian model (C-TRAIL Model v1.0) derived from the Community Multiscale Air Quality Model (CMAQ Model v5.2). Geoscientific Model Development 13, 3489–3505 (2020).
27. Woo, J.-H. et al. Development of the CREATE Inventory in Support of Integrated Climate and Air Quality Modeling for Asia. Sustainability 12, 7930 (2020).
28. Skamarock, C. et al. A Description of the Advanced Research WRF Model Version 4.1. (2019) doi:10.5065/1dfh-6p97.
29. Brown, A. et al. Unified Modeling and Prediction of Weather and Climate: A 25-Year Journey. Bulletin of the American Meteorological Society 93, 1865–1877 (2012).

supplementaryinformationnpjclimatsci.docx

Editorial decision: revise (01 Dec, 2022)
Review #1 received at journal (30 Nov, 2022)
Review #2 received at journal (23 Nov, 2022)
Review #3 received at journal (08 Nov, 2022)
Reviewer #3 agreed at journal (08 Nov, 2022)
Reviewer #2 agreed at journal (04 Nov, 2022)
Reviewer #1 agreed at journal (03 Nov, 2022)
Reviewers invited by journal (03 Nov, 2022)
Editor assigned by journal (03 Nov, 2022)
Submission checks completed at journal (13 Oct, 2022)
Unknown event (07 Oct, 2022)
First submitted to journal (05 Oct, 2022)
