Missing or Underrated Super-emitters of Nitrogen Oxides in China

Yuqing Pan, Lei Duan, Mingqi Li, Pinqing Song, Nan Xv, Jing Liu, Shaozhuo Li, Yifei Le, Cui 3 Wang, Jingjun Ma, Xin Zhou, Wei Wang, Yali Shi, Wei Su, Gang Wang, Liqiang Wang, Xue 4 Chen, Yan Xia, Linhui Jiang, Yibo Zhang, Mengying Li, Zhen Li, Weiping Liu, Shaocai Yu, 5 Daniel Rosenfeld, John H. Seinfeld, Pengfei Li 6 College of Science and Technology, Hebei Agricultural University, Baoding, Hebei 071000, P.R. China 7 College of Life Science, Zhejiang Chinese Medical University, Hangzhou 310053, Zhejiang, P.R. China 8 Research Center for Air Pollution and Health; Key Laboratory of Environmental Remediation and Ecological Health, 9 Ministry of Education, College of Environment and Resource Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, 10 P.R. China 11 Institute of Earth Science, The Hebrew University of Jerusalem, Jerusalem, Israel 12 Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA 13 14 15 Correspondence to: Pengfei Li (lpf_zju@163.com); 16 Cui Wang (wangcui198506@163.com); 17 Shaocai Yu (shaocaiyu@zju.edu.cn). 18 19


Introduction 45
Nitrogen oxides (NOx ≡ NO + NO2) play a central role in the formation of fine particular matter and ozone and have 46 implications for climate change, human health, and life expectancy 1-4 . They are typically emitted by all combustion 47 processes, particularly industrial activities (e.g., power or chemical plants) [5][6][7][8] . China is a case in point. There is a heavy-tail 48 distribution of NOx emission sources in a number of localized regions 9,10 , where a few sources (so-called super-emitters) 49 generally emit highly concentrated NOx plumes and might even dominate localized emissions with limited geographical 50 extent (i.e., ~ 1 × 1 km 2 ). Hence, unique opportunities for mitigation are presented by super-emitters, particularly by those 51 due to leaks and abnormal operating conditions 11,12 . 52 However, efforts to guide super-emitter mitigation are complicated by large inconsistencies between emission estimates 13-16 . 53 This is primarily because these inventories are generally designed at the regional scale rather than localized super-emitters. 54 For instance, the foundations of bottom-up inventories, such as activity data and emission factors, are often outdated, sparse, 55 and unrepresentative for super-emitters 5,7,17, 18 . By comparison, top-down attempts relying on relatively accurate and up-to-56 date measurements present a more promising future 6 . Nevertheless, there is a dearth of available regular measurements for 57 the super-emitters 19 . Field campaigns are also spatially sparse and temporally infrequent, thus inapplicable for the super-58 emitters distributed over a large scale. 59 Submitted to Nature Communication 3 For decades, satellite measurements have provided spatial patterns and magnitude of tropospheric NO2 vertical column 60 densities (VCDs) on a global scale, central to improving our knowledge of the NOx emission budget 1,6,20 . However, the 61 ability to detect super-emitters has been limited by pixels far larger than 1 × 1 km 2 21 . Finally, the emissions attributed to the 62 NOx super-emitters have not been well isolated and evaluated nationwide. 63 The TROPOspheric Monitoring Instrument (TROPOMI) on the Sentinel-5 Precursor has an unprecedented spatial resolution 64 of up to 3.5 × 7 km 2 (3.5 × 5.5 km 2 from August 2019 onward) and a high signal-to-noise ratio 7,13 . A representative top-65 down method using a CTM and a Kalman Filter has tested the TROPOMI measurements at a sacrifice of spatial resolution 66 (i.e., ~ 0.25° × 0.25°) 22 , consequently focusing only on regional issues. A recent study preserved the benefit of its high 67 spatial resolution and experimentally exposed a few large emission sources on a national scale 7 . However, the original 68 TROPOMI's resolution (3.5 × 5.5 km 2 ) still has a crucial gap in the scale of the localized super-emitters (~ 1 × 1 km 2 ). 69 Here we develop an efficient, super-resolution (i.e., from 3.5 × 5.5 km 2 to 1 × 1 km 2 ) inverse model by capitalizing on a 70 whole year of daily TROPOMI measurements (Methods). The retrievals of the NOx VCDs are supported by a state-of-the-art 71 CTM (i.e., the WRF-CMAQ model). To prevent the super-emitters from being omitted, a key advance of this model is to 72 take into account the nonlinear NOx VCD-Transport-Lifetime-Emission relationships and to integrate an oversampling 73 method 7,14 . The objective is to provide the first systematic survey of the NOx emission distributions at the super-resolution, 74 with a focus on geo-locating and quantifying the super-emitters nationwide, down to the industrial hotspots or parks. Our 75 survey is used to benchmark a state-of-the-art emission inventory MEICv1.3 (the Multi-resolution Emission Inventory for 76 China) and also a state-of-the-art top-down product. This will help fill an important gap in scale and re-evaluate the NOx 77 budget and hotspots in China. 78

Identification of super-emitters 79
An oversampling approach was applied to exploit the variable spatial coverage of the satellite pixels (Methods). As a result, 80 we achieved the super-resolution (i.e., from 3.5 × 5.5 km 2 to 1 × 1 km 2 ) tropospheric NOx VCDs based on the TROPOMI 81 measurements in China (Fig. 1a). We also zoomed in on five sub-regions (including Northwest, Northeast, North, Southwest, 82 East, and South China) (Figs. 1b ~ 1g). On this basis, the consequent NOx emission map was derived by an efficient, 83 compatible top-down inverse model (Fig. S1). In theory, this set of maps ( Fig. 1 and Fig. S1) is distinct from previous 84 outcomes obtained from early satellite-based surveys (e.g., those based on the OMI or TROPOMI measurements) 13,14,23 . This 85 is solely attributed to major improvements in the model developed here, which is of hyperfine resolution and takes 86 meteorological and chemical effects into account (Methods; Supplementary Information). An illustrative comparison is 87 explained in the next section. We isolated and identified 1625 hotspots (Fig. 1) that are inventoried in Supplementary Table  88 1 only if they could be resolved unambiguously on the basis of the maps alone without the need for a priori knowledge. Each 89 exhibited a prominent localized (i.e., 1 × 1 km 2 ) NOx enhancement regarding both VCDs and emissions. 90 Submitted to Nature Communication 4 Spatially, we found that these super-emitters are scattered on every corner in China, even in remote zones close to the 91 frontiers (e.g., Northwest and Southwest China) (Fig. 1). As expected, such super-emitters concentrated over North, East, 92 and South China. By combining the Landsat 8 images, the identified super-emitters can be classified into two classes: 93 industrial hotspots and industrial parks. They can be linked to the super-emitters typically containing individual point 94 sources or intensive industrial facilities. Illustrative examples are shown in Fig. 2 and Fig. S3, and detailed spatial 95 information is highlighted in Fig. S2. The 936 hotspots in the class of industrial hotspots were consistently found to be 96 associated with isolated industrial factories or power plants with one or more chimneys, as presented in the visible images. 97 For instance, a typical localized NOx emission maximum was found in a remote desert area in Northwest China (Fig. 2a) The second class, that of industrial parks, is linked to aggregative zones with massive industrial facilities but limited 106 geographical extent (i.e., 1 × 1 km 2 ), for which nearly 700 parks were detected. They were mostly associated with chemistry 107 and manufacturing industries, such as oil and gas, iron and steel, and foundry production. Commonly, rural regions were also found with enhanced NOx VCDs and emissions. These regions correspond to, for 119 example, residential areas and small-scale manufacture clusters, such as Jiqingbao (North China) (Fig. S4a), Pangjing (East 120 China) (Fig. S4b), and Chetian (Southwest China) (Fig. S4c). The primary NOx sources might be attributed to scattered coal 121 combustion 24,25 . Nevertheless, it was difficult to link those sources with clear, well-isolated super-emitters. Also, NOx 122 emissions in cities represent a substantial part of the total atmospheric NOx budget. Megacities, such as Beijing (Fig. S4d), 123 Submitted to Nature Communication 5 Shanghai (Fig. S4e), Guangzhou (Fig. S4f), and Shenzhen (Fig. S4g), are the cases in point. We emphasized that, from a 124 narrower spatial perspective (i.e., ~ 10 × 10 km 2 ), small-medium cities, such as Lhasa (Southwest China) (Fig. S4h), 125 Shizuishan (Northwest China) (Fig. S4i), and Yongan (South China) (Fig. S4j), are no exception. The main sources can be 126 consistently related to urban transportation, which, however, are too diffuse to emerge as individual super-emitters in our 127 maps. In addition, the one-year average of satellite detections was generally difficult to capture discontinuous and 128 instantaneous biomass burning. Thus, hotspots dominated by biomass burning were excluded from this study. 129

Assessment of super-emitters 130
We  Table 1). 132 For 67% of the cities, the emission fluxes agreed within a factor of two (85% within a factor of three) and, importantly, when 133 all large and medium-sized cities were considered, no major bias emerged. Note that, compared to our results, the inventory 134 often underestimated the NOx emissions in remote small-sized cities significantly (by more than a factor of five) 135 (Supplementary Table 1). Therein some isolated super-emitters (i.e., industrial hotspots or parks), such as Hejing (Northwest 136  S5l) were captured by the one-year satellite observations. They represented the localized (i.e., 1 ~ 10 km) maximum but 154 were undoubtedly missed in the inventory.
Besides, the satellite-based study could capture the onset or the discontinuation of industrial activities unambiguously. New 156 or expanded super-emitters that emerged within the satellite measurements were found in this way. Correspondingly, the 157 satellite-based emission fluxes were significantly higher (at least one order) than the bottom-up estimations. For instance, 158 high NOx emissions were observed over Xincheng (North China) (Fig. S6a) and Huagang (South China) (Fig. S6b). On the 159 other hand, industrial plant closures were also detected, for example, over Shidian (North China) (Fig. S6c) and over Hunhe 160 (Northeast China) (Fig. S6d). Therein the bottom-up emission inventory was likely outdated, overestimating the satellite-161 based estimates by more than 100%. 162 Figure 4 shows the stable monthly variations (i.e., < 26%) in the satellite-based emission estimates of eight representative 163 super-emitters. The detailed variation information is shown in Fig. S7. Therefore, for most of the super-emitters, one-month 164 satellite overpasses could derive their yearly emission estimates and be capable of tracking their emission variations. 165

Discussion 166
Here we develop an efficient, super-resolution inverse approach by collecting an entire year of daily TROPOMI 167 measurements, applying an oversampling method, and building a NOx VCD-Transport-Lifetime-Emission model. The key 168 for preventing the smearing of the super-emitters is to exploit the super-resolution and wind-driven horizontal fluxes. This is 169 particularly useful for preserving strong gradients close to super-emitters even on top of considerably high pollution. In 170 contrast, current top-down approaches generally apply time-consuming inverse algorithms to satellite measurements, thus 171 sacrificing spatial resolution and focusing on regional issues 29 . We compared our results with a state-of-the-art top-down 172 NOx emission inventory 22 for three representative super-emitters (Fig. S8). Although this inventory also relied on the 173 TROPOMI observations, it adopted another inverse model (i.e., the DESCO algorithm) to explore regional emission 174 variations on a 0.25° × 0.25° resolution. Figures S8 and S9 demonstrate that these two products were very similar in the 175 general spatial distributions and regional magnitude. In contrast, a key advance in this study was the considerable increase in 176 spatial resolution. Of particular relevance are the three super-emitters, which can only be distinguished in our results. 177 This work has presented a detailed and hyperfine inventory of NOx super-emitters over China. They can be consistently 178 linked to either industrial hotspots or parks and responsible for the localized NOx budget. More importantly, their emissions 179 are mostly underestimated, even displaced and missed, in a widely used emission inventory. This work can also capture the 180 emergence or closure of super-emitters in a relatively short time (i.e., monthly). Note that manual efforts regarding 181 distinguishing the super-emitters limit this study. Deep learning algorithms might be a prospective alternative that allows us 182 to rapidly and impartially identify super-emitters 30 . To date, continuous emission monitoring systems (CEMS) remain 183 largely absent for super-emitters, particularly in industrial parks. Our results suggest that it is necessary to revisit traditional 184 bottom-up NOx inventories. Otherwise, it is quite possible to mislead local air pollution distributions in CTMs and thus local 185 efforts for mitigation. By comparison, satellite surveys can make an important contribution to monitoring NOx emissions, 186 particularly beneficial for up-to-date emission inventories for quickly developing countries. Therefore, widespread and 187 Submitted to Nature Communication 7 sustained deployment of a multi-tiered observational strategy, i.e., a combination of this hotspot detection technique with a 188 near-real-time ground-based and another satellite-based monitoring network of regional sources, could greatly advance 189 scientific understanding of NOx budgets. More specifically, our approach, together with more comprehensive bottom-up 190 information, could be expanded to detect abnormal (e.g., leakage) facilities 31  The satellite images come from the Landsat 8 imageries (Fig. S2a ~ S2d). 207 The satellite images come from the Landsat 8 imageries (Fig. S2e ~ S2h). Methods 228

Satellite observations 229
The TROPOMI instrument on the European Space Agency's Sentinel-5P satellite provides daily global coverage of slant 230 column densities (SCDs) of NO2 with an unprecedented spatial resolution of up to 3.5 × 7 km 2 (3.5 × 5.5 km 2 from August 231 2019 onward) 7,13,34 . Its overpass time is close to noon (13:30 local time). The satellite measurements with a high resolution 232 allow us to analyze the finer scale spatiotemporal characteristics of NO2. On this basis, the following calculation of 233 tropospheric NO2 VCDs was very similar to the operational product 35 , for which detailed algorithms can be found in the 234 production discription 36 . However, the difference was associated with the conversion of SCDs to VCDs that relied on the air 235 mass factor (AMF) approach. In the operational product, the a priori NO2 profile was taken from the global chemistry 236 transport Tracer Model 5 (TM5-MP) with a spatial resolution of 1° × 1°. By comparison, we improved this process by using 237 a priori profile from a comparably high resolution (i.e., 5.5 × 5.5 km 2 ) CTM (i.e., the WRF-CMAQ model 37,38 ). Note that 238 measurements with cloud fraction above 30% or a "qa value" (indicating data quality) below 0.75 were skipped 36 . For each 239 pixel, to derive the lower VCD and eliminate the bias of the stratospheric estimate, we subtracted the 5 th percentile within the 240 individual pixel from the total VCD 7 . 241 The WRF-CMAQ model 242 The two-way coupled WRF-CMAQ model (the WRF-CMAQ model) was applied to provide a priori profile and wind fields 243 on the spatial resolution of 5.5 × 5.5 km 2 (comparable to the resolution of the TROPOMI instrument). The results are the 244 foundation of the AMF calculation and the following top-down NOx emission estimates. Detailed model settings can be 245 found in our previous papers 39, 40 . Meteorological initial and boundary conditions were obtained from the European Centre 246 for Medium-range Weather Forecasts (ECMWF) ERA5 reanalysis dataset with the spatial resolution of 1º × 1º and temporal 247 resolution of 6 hours. The analysis nudging option was switched on for temperature, humidity above the PBL, and winds at 248 all model levels, thus being nudged to the meteorological driving data (i.e., the ERA5 reanalysis dataset) 41 . 249 The horizontal domain of the model covered mainland China with a compatible horizontal resolution (5.5 × 5.5 km 2 ) 250 following a Lambert Conformal Conic projection (Fig. 1a). In terms of the vertical configuration, 29 sigma-pressure layers 251 ranged from the surface to the upper-level pressure of 100 hPa, 20 layers of which are located below around 3 km to derive 252 finer meteorological and chemical characteristics within the planetary boundary layer. 253 The anthropogenic emissions were obtained from MEICv1.3 15 , which contained primary species (e.g., primary PM2.5, SO2, 254 NOx, CO, and NH3) from five anthropogenic sectors (i.e., agriculture, power plant, industry, residential, and transportation). 255 This inventory was initially designed with the spatial resolution of 0.25° × 0.25° and thus reallocated to match the domain 256 configuration (i.e., 5.5 × 5.5 km 2 ) in the study.

One-year oversampled tropospheric NO2 VCDs 258
To achieve a super-resolution reconstruction for the one-year tropospheric NO2 VCDs, we applied the oversampling method 259 to convert from the original satellite pixels to the 1 × 1 km 2 grid cells. As demonstrated in previous studies associated with 260 reshaping air pollutant distributions, this technique can exploit the fact that the location, shape, and orientation of the satellite 261 footprints slightly varies from one orbit to another 14,18 . Thus, a much higher resolution can be obtained on the spatial 262 distribution by sacrificing temporal information. This was distinct from the most widely used geometry methods that 263 generally rely on interpolation or binning on a rectangular latitude-longitude grid 42 . 264 Here the oversampling technique we adopted was similar to early attempts 18,42,43 . Initially, a grid size of 1 × 1 km 2 was 265 chosen. For each overpass, the TROPOMI footprint pixel coverage was calculated and, for computational reasons, 266 approximated as an ellipse on a rectangular latitude-longitude grid. And then, we calculated the area-averaged value using 267 all measurements overpassing the given cell. During this process, we also considered the influences of the footprint size that 268 was set as a weight inversely proportional to the area of each footprint. Nevertheless, a measurement result would be 269 regarded as erroneous values and thus eliminated when it was more than 10 standard deviations from the average 18  Therein , represents the tropospheric NO2 VCDs in the grid ( , ) of 1 × 1 km 2 . , denotes all ground NOx sources, which 280 combines anthropogenic, soil, and biomass burning NOx emissions. represents the ratio of NO2 over NOx concentration. In 281 theory, the daytime NOx chemical system reaches equilibrium rapidly and varies little. In this study, we set to be 0.76 7,14 . 282 The remaining terms in Eq. 1 represent the potential NOx sinks, including chemical, deposition, horizontal, and diffusional 283 loss. Given that the local overpass time of TROPOMI is close to noon (13:30 local time), the chemical NOx sink is 284 dominated by the chemical loss reaction of NO2 with OH, which can be described by a first-order time constant , and, thus, 285 can be estimated from the measured , itself as , , . , indicates the lifetimes associated with deposition and chemical loss 286 in nature. In theory, instantaneous NOx lifetime is dominated by several factors, such as ozone levels and actinic fluxes, and also the NOx concentration itself at the presence of high NOx levels. As a result, the NOx lifetime is linked with season and 288 meteorological conditions. Therefore, the assumption of the NOx lifetime is a major limitation of this method and might be 289 inappropriate for super-emitters 6,7,14,44 . 290 However, previous studies have demonstrated that the assumed lifetime could appropriately relate the measured , to the 291 actual emissions , via mass balance. As shown in previous attempts in Riyadh, the NOx lifetime has been derived to 4 292 hours with an uncertainty of 35% 6 . Its seasonal variations were found to be weak. In the United States and China, a similar 293 method was applied to cities and power plants, resulting in a mean lifetime of 3.8 ± 1.0 hours for May-September 23 . In this 294 study, we utilized a value of 4 hours for all regions. For many of the hotspots, there can be substantial transport out of the 295 box, in which case the effective lifetime would be smaller than 4 hours. Given the high spatial resolution, the emission 296 estimates of a grid cell derived in this way were more likely to be underestimated than overestimated. For specific remote 297 sites, however, 4 hours could be too short. 298 Therefore, we applied conservative assumptions using lifetimes of 1 hour and 24 hours to provide upper-and lower-bound 299 emission estimates. The corresponding uncertainties are considered and shown as error bars in Fig. 3. Although more 300 sophisticated methods, like regional CTM simulations, are available, they would not significantly optimize the super-301 resolution emission estimates for the super-emitters. Alternatively, we could further introduce satellite-based HCHO VCDs 302 into our framework to constrain the OH distributions 45 , which might be useful for optimizing the model. Therein and are the diffusion coefficients in the zonal and meridional directions, respectively. We applied a random 311 walk assumption to derive the diffusion coefficients. The random walk step was assumed to be fixed and equal to the 312 deviation of wind speed in the zonal or meridional direction (̅̅̅ ̅̅̅̅). is one hour, compatible to the sampling temporal 313 interval of the simulated wind information. 314 Note that a background column was first subtracted from the NOx VCDs, so as to include only the emission fluxes of the 315 point sources responsible for the hotspots. The background column was estimated as the 5 th percentile of all the 1 × 1 km 2 316 oversampled grids 7 . The background correction is illustrated in the Supplementary Information. 317 Emissions of super-emitters and cities were also calculated in MEICv1.3 and used to compare with our satellite-based 318 emission estimates. For 2016, this bottom-up inventory was established at a spatial resolution of 0.25° × 0.25° and of five anthropogenic sectors (i.e., agriculture, power plant, industry, residential, and transportation). According to our super-320 resolution results, the inventory was first re-gridded to the 1 × 1 km 2 resolution. For the cities, the emissions in MEICv1.3 321 were then summed over all sectors, while, for the super-emitters, only those of industry and power plant sectors were 322 considered. Specifically, these super-emitters were attributed to industrial hotspots or parks, including the following sectors: 323 power industry, oil refineries, transformation industry, combustion for manufacturing, and process emissions during 324 production and application. Note that only the 1 × 1 km 2 grids that contained the super-emitters were used to calculate the 325 averaged emission estimated. 326 For several reasons, MEICv1.3 may underestimate the emissions of the super-emitters even more than shown in Fig. 3. First,  327 the main reason, as explained above, is attributed to the NOx lifetime, which could be smaller than the 4 hours we assumed, 328 directly resulting in an underestimation of our estimates. Second, depending on the thermal contrast, the TROPOMI 329 instrument, like any other infrared instrument, can miss the lower layer of the atmosphere in which NOx is emitted. Hence, 330 our results are more likely to underestimate NOx VCDs than to overestimate them. 331

Identification and attribution 332
By combining the tropospheric NO2 VCDs, the top-down NOx emission estimates, and the high-quality visible imageries 333 from Landsat 8, we identified the super-emitters manually. Although automated ways with uniform thresholds might result 334 in more consistent and flexible identifications, no satisfactory set of criteria was found that could be applied for the super-335 emitters nationwide. This is mainly because, in localized regions, the NOx budgets respond to the changes in not only the 336 super-emitters but also the background. In addition, the wildfire emergencies generally emitted elevated NOx plumes and 337 especially hamper the identification of hotspots over a large area, even on a one-year average (for example, most of 338 Northeast and Southwest China) 46 . We thus combined the MODIS fire product and the high-quality visible imageries to 339 eliminate the fire hotspots. 340 To identify the super-emitters manually, we identified the localized maxima of both VCDs and emissions, which was 341 significantly higher than the background value. Along with the visible imageries, we could attribute the characteristic 342 enhancement to an industrial hotspot or park. As a result, the typical area with a super-emitter could be identified with a 343 limited geographical scope (i.e., < 5 × 5 km 2 ). Each super-emitter was approximated as a rectangle on the latitude-longitude 344 grid, the central, minimum, and maximum coordinates of which were recorded in Supplementary Table 1. We must highlight 345 that, within densely source areas (e.g., megacities), the emission gradients from numerous super-emitters interfered with 346 each other and were inevitably missed. 347 Although the visible imageries enabled us to ascertain the locations of the super-emitters, it was not possible to identify the 348 industry type directly. Particularly, in China, where the industry is still rapidly developing, the associated bottom-up 349 information may not be assessible. In addition, the Baidu Map allowed us to assign a name to each super-emitter. The usual 350 choice was the name of the specific address recorded in Baidu Map or the name of the nearest geographical area.