Longitudinal association between ambient nitrogen dioxide exposure and all-cause mortality in Chinese adults

Many population-based studies have investigated the long-term effects of nitrogen dioxide (NO₂) on mortality, but their findings are highly heterogeneous. In highly populated Asian countries, cohort evidence for the NO₂-mortality association remains extremely sparse. This study aimed to quantify the longitudinal association of ambient NO₂ exposure with all-cause mortality in Chinese adults. A national cohort of 30,843 adult men and women was drawn from 25 provincial regions across mainland China and followed up from 2010 through 2018. Participants' exposures to ambient air pollutants were assigned according to their residential counties at baseline, using monthly estimates derived from high-quality gridded datasets developed with machine learning methods. Cox proportional hazards models with time-varying exposures were used to assess the association of all-cause mortality with long-term exposure to ambient NO₂. An approximately linear NO₂-mortality relation (p = 0.273 for nonlinearity) was identified across a broad exposure range of 6.9–57.4 µg/m³. Each 10-µg/m³ increase in annual NO₂ exposure was associated with a hazard ratio of 1.127 (95% confidence interval: 1.042–1.219, p<0.003) for all-cause mortality. Risk estimates remained robust after additional adjustment for the confounding effects of co-pollutants (i.e., PM₂.₅ or O₃). In 2018, 1.65 million deaths could be attributed to ambient NO₂ exposure (national average 17.3 µg/m³) in China, a decrease of 4.3% compared with the estimate of 1.72 million in 2010 (20.5 µg/m³). This cohort study provides national evidence for an elevated risk of all-cause mortality associated with long-term exposure to ambient NO₂ in Chinese adults.


Introduction
Ambient nitrogen dioxide (NO₂) is a common air pollutant originating primarily from fuel combustion and traffic. As a precursor to ground-level ozone, NO₂ is also involved in the secondary generation of fine particles (PM₂.₅). In recent decades, health assessments of the independent impact of NO₂ have attracted great research interest across the globe in the context of rapid urbanization and transportation development.
These findings highlighted the great significance and urgency, for policy making, of quantifying the chronic effects of NO₂ exposure on human life expectancy.
During past decades, longitudinal associations between NO₂ exposure and mortality have been investigated in a large number of population-based cohort studies, although great heterogeneity was identified between regions and studies [9][10][11][12]. The majority of existing studies came from the western world [9,10], including North America and Europe, and suggested strong evidence for an NO₂-induced risk of all-cause death. In highly populated countries in Asia, cohort evidence for the longitudinal NO₂-mortality association was extremely sparse [9,10], which introduces great uncertainty into pooled analyses. In China, only two regional cohorts have investigated the chronic impact of NO₂ exposure on mortality, identifying null associations among the Hong Kong elderly [13] but a protective effect in four northeast cities [14]. These mixed findings warrant more high-quality, large-scale longitudinal investigations across mainland China to better understand the mortality burden due to ambient NO₂ in the Chinese population.
In this study, we constructed a nationwide prospective cohort of ~30,000 Chinese adult men and women, using population-based survey data from 2010 to 2018 from the China Family Panel Studies, spanning 126 prefecture-level cities. Our primary purpose was to quantify the association of long-term exposure to ambient NO₂ with all-cause mortality in Chinese adults. A secondary purpose was to depict the spatiotemporal patterns of NO₂-attributable deaths between 2010 and 2018 in China, on the basis of the dose-response function derived from our cohort analysis.

Study population and design
The study population was drawn from the China Family Panel Studies (CFPS) [15], an ongoing national survey across 25 provincial regions in mainland China. Using a multi-stage probability strategy with stratification [16], the CFPS baseline survey, conducted from April 2010 through February 2011, included a total of 33,600 adult men and women aged 16-110 years, and follow-up investigations were conducted every two years during 2012-2020. Face-to-face interviews were performed with the aid of computer-assisted personal interviewing technology, and participants' data (e.g., demographic and socioeconomic characteristics, lifestyle, behavioral patterns, and health status) were thoroughly collected by well-trained investigators through standard questionnaires. The CFPS study was ethically approved by the Peking University Biomedical Ethics Review Committee (Approval number: IRB00001052-14010), and all participants signed informed consent forms.
To investigate the long-term NO₂-mortality association, we adopted a longitudinal cohort design using the CFPS baseline (CFPS-2010) and four subsequent waves of follow-up data (CFPS-2012, 2014, 2016, and 2018), which are publicly available at the Peking University Open Research Data Platform (https://opendata.pku.edu.cn/dataverse/CFPS/). Death information (e.g., date and cause of death) for deceased CFPS participants was ascertained from their family members in the follow-up interviews. We considered only death events from all causes in this study, because some unknown or indeterminable causes of death were identified.
From the 33,600 adult participants included at CFPS baseline, we excluded those who had no follow-up data during 2012-2018 (n = 2549) and those whose death date was invalid or logically inconsistent (n = 25). To avoid selection bias, we also excluded participants who died within the first year after the baseline interview [17,18] (n = 183). A national cohort of 30,843 adult men and women was finally included in our study, and the geographic distribution of the participants is illustrated in Fig. 1.

Exposure assessment and covariates

Exposure assessment
Because CFPS redacted detailed residential addresses from the public-access data for privacy reasons, exposure assessments for air pollutants were performed at the county level as a proxy. For each CFPS participant, we identified a county-level geographic unit using the uniform 6-digit administrative codes in mainland China, and linked participants in our cohort to 162 counties in 25 provincial regions [19]. Ground-level monthly NO₂ concentrations at a 0.25°×0.25° resolution were derived from the full-coverage, high-quality datasets for near-surface air pollutants in China. Gridded datasets for NO₂ (https://doi.org/10.5281/zenodo.3988349, ChinaHighNO₂) were originally generated from the OMI/Aura Tropospheric Column NO₂ products based on satellite remote sensing and a machine-learning method (Space-Time Extra-Trees model, STET) [20]. Estimates from the ChinaHighNO₂ datasets showed high consistency with nationwide ground monitors across China from 2013 to 2019, with an overall cross-validation coefficient of determination (CV-R²) of 0.72 and a root mean square error (RMSE) of 9.97 µg/m³.
Modelling details can be found in our previous publications [20,21].
Monthly mean ozone concentrations at a 0.1°×0.1° resolution were estimated using a nationwide prediction model based on eXtreme Gradient Boosting (XGBoost) trees [22] . The prediction model was  [22] .
Monthly ground-level estimates of PM₂.₅ concentrations at a 0.1°×0.1° resolution were extracted from a full-coverage, high-resolution air pollutant dataset, Tracking Air Pollution in China (TAP, http://tapdata.org).
As the first near real-time air pollutant database in China [23], TAP PM₂.₅, developed by Tsinghua University, is estimated with a two-stage random forest model coupled with the synthetic minority oversampling technique and a tree-based gap-filling method [24]. This machine learning-based model exhibits good agreement with ground monitors, with an out-of-bag CV-R² of 0.80-0.88 and RMSE of 13.9-22.1 µg/m³ for different years.

Covariates
In light of prior cohort studies assessing NO₂-mortality associations [9,10], we considered a rich set of potential confounders, including demographic characteristics (sex, age, ethnicity, educational attainment, urbanicity, marital status, and employment status), lifestyle (smoking, alcohol consumption, physical activity, and sleep duration), health status (body-mass index [BMI], chronic disease prevalence, and depressive symptoms), and household characteristics (annual household income, and household air pollution from solid fuels).
Specifically, ethnicity was defined as a dichotomous variable indicating Han or minority. Educational attainment was grouped into illiteracy, primary or middle school, and high school or above. Smokers and alcohol drinkers referred to former or current regular smoking or drinking. Sleep duration was categorized into three groups: <6 h, 6-8 h, and more than 8 h. Physical activity was a dummy variable indicating whether a participant exercised regularly. Body-mass index was calculated by dividing weight in kilograms by the square of height in meters, and grouped into <18.5, 18.5-24, and 24+ kg/m².
Chronic disease prevalence was coded as 1 if a participant had one or more doctor-diagnosed chronic diseases such as hypertension, stroke, and diabetes. Depressive status was measured by the K6 screening instrument of the Center for Epidemiological Studies-Depression (CES-D) scale [25], and defined as a depression episode if the CES-D score was higher than a cut-off of 24. Household income was categorized into three groups of <15000 RMB, 15000-40000 RMB, and ≥40000 RMB, generally based on its quartile distribution.
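The categorical coding described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the handling of boundary values (for example, a BMI of exactly 24 or a sleep duration of exactly 8 h) is an assumption, since the text gives only the group labels.

```python
def bmi_category(weight_kg: float, height_m: float) -> str:
    """Classify body-mass index (kg/m^2) into the three groups named in the text."""
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return "<18.5"
    if bmi < 24:          # assumption: exactly 24 falls in the "24+" group
        return "18.5-24"
    return "24+"


def sleep_category(hours: float) -> str:
    """Classify sleep duration into <6 h, 6-8 h, and >8 h."""
    if hours < 6:
        return "<6 h"
    if hours <= 8:        # assumption: exactly 8 h falls in the "6-8 h" group
        return "6-8 h"
    return ">8 h"


def income_category(annual_rmb: float) -> str:
    """Classify annual household income (RMB) into the three groups in the text."""
    if annual_rmb < 15000:
        return "<15000"
    if annual_rmb < 40000:
        return "15000-40000"
    return ">=40000"


# Hypothetical participant: 70 kg, 1.75 m, 7 h of sleep, 32000 RMB per year.
example = (bmi_category(70, 1.75), sleep_category(7), income_category(32000))
```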

Statistical analysis
Baseline characteristics of the study population were presented as percentages for categorical variables or mean ± standard deviation (SD) for continuous variables. We assessed the long-term association between ambient NO₂ exposure and all-cause death using time-varying Cox proportional hazards models, and quantified temporal and spatial changes in estimates of deaths attributable to ambient NO₂ between 2010 and 2018. Analyses were performed with R version 4.0.3 (R Foundation for Statistical Computing, Vienna, Austria). All tests were two-sided, and a p-value < 0.05 was considered statistically significant.

Main multivariate analysis
Cox proportional hazards models with time-varying exposures on the annual time scale were used to quantify long-term associations of NO₂ exposure with all-cause mortality. Person-years of follow-up were calculated as the intervals from the dates of baseline interviews (i.e., study enrolment) to the dates of death, loss to follow-up (the last follow-up interview during 2012-2018), or the end of CFPS 2018, whichever came first. Motivated by prior investigations suggesting the potential for biased risk estimates when using time-on-study time scales [26][27][28], we adopted attained age as the time scale, given the interpretability of the hazard ratio as a function of age [29]. All models were stratified by sex and age at baseline [30], which was grouped into 10-year intervals (i.e., 15-24, 25-34, 35-44, 45-54, 55-64, 65-74, and ≥75) to allow for sufficient sample sizes for assessing interactions. We estimated hazard ratios (HRs) of all-cause death and corresponding 95% confidence intervals (CIs) associated with a 10-µg/m³ increase in annual NO₂ concentration, adjusting for the aforementioned confounders.
To investigate the shape of the concentration-response (C-R) curve for the NO₂-mortality association, we entered annual NO₂ exposure as a restricted cubic spline (RCS) with 3 knots in the multivariate-adjusted model. The number of knots for RCS smoothing was selected according to the Akaike information criterion and the Bayesian information criterion [31,32] (Table S1). Nonlinearity of the concentration-response curve was examined by comparing the model fit of the linear and RCS models via a likelihood-ratio test [33][34][35].
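A time-varying Cox model on the attained-age scale is usually fitted to data in counting-process form, with one row per participant per exposure interval. The following is a minimal sketch of that data layout, not the authors' code: the function name, the participant, and the NO₂ values are hypothetical, and splitting follow-up at calendar-year boundaries is an assumption about how annual exposure was aligned.

```python
from datetime import date

DAYS_PER_YEAR = 365.25


def split_follow_up(birth, entry, exit_date, died, annual_no2):
    """Split one participant's follow-up into calendar-year intervals.

    Returns rows of (age_start, age_stop, event, no2). Ages are in years
    (attained-age time scale); event is 1 only in the final interval of a
    participant who died; no2 is the annual exposure for that calendar year.
    """
    rows = []
    start = entry
    while start < exit_date:
        # Each interval ends at the next 1 January or at exit, whichever is first.
        stop = min(date(start.year + 1, 1, 1), exit_date)
        event = 1 if (died and stop == exit_date) else 0
        rows.append((
            (start - birth).days / DAYS_PER_YEAR,
            (stop - birth).days / DAYS_PER_YEAR,
            event,
            annual_no2[start.year],
        ))
        start = stop
    return rows


# Hypothetical participant and hypothetical county-level annual NO2 (ug/m3).
rows = split_follow_up(
    birth=date(1960, 6, 15),
    entry=date(2010, 7, 1),
    exit_date=date(2012, 3, 10),
    died=True,
    annual_no2={2010: 20.5, 2011: 19.8, 2012: 19.0},
)
# Person-years contributed by this participant (sum of interval lengths).
person_years = sum(stop - start for start, stop, _, _ in rows)
```

Rows of this shape can then be passed to any Cox routine that accepts (start, stop, event) records, with sex and baseline age group as strata.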
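The nonlinearity test can be illustrated as follows. A restricted cubic spline with 3 knots contributes one linear and one nonlinear term, so the likelihood-ratio statistic comparing it with the linear model has 1 degree of freedom. The sketch below uses only the standard library; the function name and the log-likelihood values are hypothetical and are not taken from the study.

```python
import math


def lrt_nonlinearity(loglik_linear: float, loglik_rcs3: float):
    """Likelihood-ratio test of a 3-knot RCS model against the linear model.

    With 3 knots the spline adds exactly one nonlinear term, so the statistic
    is referred to a chi-square distribution with 1 degree of freedom, whose
    upper tail probability is erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (loglik_rcs3 - loglik_linear), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical partial log-likelihoods of the two nested Cox models.
stat, p_value = lrt_nonlinearity(loglik_linear=-10250.4, loglik_rcs3=-10249.8)
```

A large p-value, as in this hypothetical example, means the spline does not fit detectably better than a straight line on the log-hazard scale.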

Subgroup and sensitivity analyses
We performed subgroup analyses stratified by sex, age, smoking and drinking status, and physical activity. Two-sample z-tests [34,36] were applied to identify potential effect modification. Several sensitivity analyses were conducted to examine the robustness of the NO₂-mortality associations estimated in our main analysis. First, to disentangle the effects of co-pollutants, we included two pollutants simultaneously (NO₂ plus PM₂.₅ or O₃) in an analysis using a consistent annual scale for time-varying exposures. Second, we incorporated time-varying exposure to ambient temperature (e.g., annual average temperature) into our survival analysis to account for the long-term effects of weather conditions. Given emerging evidence of nonlinear effects of temperature on mortality [37], we adjusted for annual average temperature through an RCS function with 3 knots [38]. Specifically, we first derived China's daily gridded temperatures (0.1°×0.1°) during 2010-2018 from the ERA5 climate reanalysis datasets produced by the European Centre for Medium-Range Weather Forecasts (https://climate.copernicus.eu/climate-reanalysis), and then calculated county-specific average temperatures for the 162 included counties to estimate participants' time-varying exposures at annual scales.
Third, as an alternative method for selecting covariates [39], a directed acyclic graph (DAG) was used to determine minimal sufficient adjustment sets for estimating the total effect of nitrogen dioxide on adult mortality (Fig. S1). The resulting DAG-based Cox model was stratified by sex and age and adjusted for educational attainment, household income, and urbanicity.
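The two-sample z-test for effect modification compares subgroup estimates on the log-HR scale, with standard errors recovered from the reported 95% CIs. The sketch below shows one common form of this test; the function names and the subgroup estimates are hypothetical and are not results from this cohort.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def log_hr_and_se(hr: float, lower: float, upper: float):
    """Recover the log-HR and its standard error from an HR and its 95% CI."""
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_975)
    return beta, se


def z_test_effect_modification(hr_a, ci_a, hr_b, ci_b):
    """Two-sided z-test for a difference between two subgroup log-HRs."""
    beta_a, se_a = log_hr_and_se(hr_a, *ci_a)
    beta_b, se_b = log_hr_and_se(hr_b, *ci_b)
    z = (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value


# Hypothetical subgroup estimates per 10-ug/m3 increase in NO2.
z, p_value = z_test_effect_modification(1.20, (1.10, 1.30), 1.00, (0.90, 1.10))
```

The test treats the two subgroup estimates as independent, which is reasonable when the subgroups do not share participants.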

Attributable deaths due to NO 2 exposure
In line with the methodological strategy widely adopted in prior modelling studies [40,41], we quantified spatiotemporal changes in NO₂-attributable deaths in China between 2010 and 2018. Based on the dose-response function estimated by the RCS smoother with 3 knots, we first derived HR estimates over a range of annual NO₂ concentrations at intervals of 0.1 µg/m³, relative to a counterfactual exposure of 6.9 µg/m³ (the lowest county-average level in our cohort) [41]. For a given county (i) in China, attributable deaths (AD_i) due to NO₂ exposure were then calculated by multiplying total observed deaths (TD_i) in a specific year (e.g., 2010 and 2018) by the population attributable fraction (PAF_i), estimated as (HR − 1)/HR × 100% [42]. Total deaths attributable to ambient NO₂ in China and its provincial regions were finally obtained by summing county-specific AD estimates for a given year. More details on the AD calculation and relevant data sources can be found in the supplementary material, as documented in our prior publication [38].

Table 1 Table 2). The included study locations covered a wide span of latitudes across mainland China (Fig. 1), with a substantial temperature range of -2.0-24.2°C in 2010 (Table 2).

[1.101-1.340] for a 10-µg/m³ rise) associated with NO₂ exposure, and we identified a significant effect modification by sex (p = 0.024). Younger participants, particularly those aged less than 45 years, were at higher risk when exposed to outdoor NO₂ pollution. NO₂-mortality associations were observed among smokers and alcohol drinkers only, corresponding to an estimated HR of 1 Table S4.
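The attributable-death calculation described in the methods, PAF = (HR − 1)/HR and AD_i = TD_i × PAF_i summed over counties, can be sketched as follows. Note the assumptions: the study derives HRs from the fitted 3-knot RCS curve, whereas this sketch substitutes a log-linear curve built from the reported HR of 1.127 per 10 µg/m³; concentrations below the counterfactual are assigned an HR of 1; and the county names, death counts, and concentrations are hypothetical.

```python
import math

COUNTERFACTUAL = 6.9            # ug/m3, lowest county-average level in the cohort
BETA = math.log(1.127) / 10.0   # log-HR per 1 ug/m3, from HR 1.127 per 10 ug/m3


def hr_at(conc: float) -> float:
    """HR relative to the counterfactual (log-linear stand-in for the RCS curve)."""
    return math.exp(BETA * max(conc - COUNTERFACTUAL, 0.0))


def paf(hr: float) -> float:
    """Population attributable fraction, (HR - 1) / HR, as a proportion."""
    return (hr - 1.0) / hr


def attributable_deaths(total_deaths: float, conc: float) -> float:
    """County-level attributable deaths: total deaths times PAF."""
    return total_deaths * paf(hr_at(conc))


# Hypothetical counties: (name, total deaths in the year, annual NO2 in ug/m3).
counties = [
    ("county_a", 1200, 12.4),
    ("county_b", 3500, 31.0),
    ("county_c", 800, 6.9),
]
total_ad = sum(attributable_deaths(deaths, conc) for _, deaths, conc in counties)
```

Because the PAF is bounded by the HR at each county's own concentration, counties at the counterfactual level contribute no attributable deaths regardless of their size.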

Discussion
To the best of our knowledge, this is the first national prospective cohort study investigating the mortality effects of long-term exposure to ambient NO₂ in China. By involving more than 30,000 adult participants aged 16-110 years, our study provided robust longitudinal evidence for a raised death risk associated with NO₂ exposure in the Chinese general population. Stratified analyses revealed potential heterogeneity in NO₂-mortality associations between subpopulations, suggesting significantly greater vulnerability among men.
In accordance with the up-to-date evidence synthesized from epidemiologic studies, our cohort study identified a significantly positive association between NO₂ exposure and all-cause mortality. Several toxicological mechanisms may explain NO₂-induced hazards. Exposure to ambient NO₂ could promote a systemic vascular oxidative stress reaction [43,44], and cause endothelial dysfunction, monocyte activation, and certain pro-atherosclerotic changes in lipoproteins, thereby initiating plaque formation, exacerbating disease, and increasing mortality [43]. For a 10-ppb rise in NO₂ exposure, a recent meta-analysis of 28 cohorts estimated a pooled risk of 1.06 (95% CI: 1.04-1.08) for all-cause mortality, although substantial heterogeneity existed across studies (HRs ranging from 0.95 to 1.91). By enrolling a national cohort of 30,843 adults in mainland China, this study associated an excess death risk of 27.8% (8.8-50.2%) with a 10-ppb increase in annual NO₂ exposure. However, two prior Chinese investigations reported a nonsignificant or opposite NO₂-mortality association in the Hong Kong elderly [13] and northern residents [14]. The sources of the large heterogeneity between existing studies remain poorly clarified, but could be related to great diversity in exposure assessment methods (e.g., fixed monitors, land use regression, and satellite-based retrievals), study demographics (e.g., age structure, locations, and NO₂ exposure levels), and methodological strategies (e.g., sample size, confounding adjustment, and statistical analysis) [10].
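The per-10-ppb estimate above follows from rescaling the per-10-µg/m³ HR on the log scale. The text does not state the unit conversion used; the sketch below assumes 1 ppb NO₂ ≈ 2.05 µg/m³ (molar mass 46.0055 g/mol divided by a molar volume of 22.414 L/mol at 0 °C and 1 atm), which gives values close to those reported. The function name is hypothetical.

```python
import math

# Assumption: conversion at 0 degrees C and 1 atm; not stated in the text.
UG_M3_PER_PPB = 46.0055 / 22.414


def rescale_hr(hr: float, from_increment: float, to_increment: float) -> float:
    """Rescale an HR from one exposure increment to another on the log scale."""
    return math.exp(math.log(hr) * to_increment / from_increment)


ten_ppb = 10.0 * UG_M3_PER_PPB  # 10 ppb expressed in ug/m3

# Reported HR and 95% CI per 10 ug/m3, rescaled to a 10-ppb increment.
hr_ppb = rescale_hr(1.127, 10.0, ten_ppb)
lower_ppb = rescale_hr(1.042, 10.0, ten_ppb)
upper_ppb = rescale_hr(1.219, 10.0, ten_ppb)

# Excess risk in percent, for comparison with estimates reported per 10 ppb.
excess_risk = [(x - 1.0) * 100.0 for x in (hr_ppb, lower_ppb, upper_ppb)]
```

Under a different temperature assumption (for example, about 1.88 µg/m³ per ppb at 25 °C), the rescaled excess risk would be somewhat smaller, so the conversion factor should be stated when comparing estimates across studies.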
Using time series data from 398 cities in 22 low- to high-income countries/regions, Meng and colleagues [8] provided strong evidence for linear associations between short-term NO₂ exposure and daily total, cardiovascular, and respiratory mortality. In terms of long-term assessments, most international studies did not investigate the dose-response curve and reported estimates by assuming a linear relation [11]. Our study did not identify evidence (p = 0.273) for a nonlinear effect of NO₂ exposure on all-cause death across a concentration range of 6.9-57.4 µg/m³. The Dutch Environmental Longitudinal Study (DUELS) [45] reported a consistent finding (p = 0.37 for nonlinearity) for associations of NO₂ with non-accidental deaths, but found a strongly superlinear relation (p < 0.005) with deaths due to circulatory diseases. Owing to the limited longitudinal evidence available for C-R associations between NO₂ and mortality outcomes, the state-of-the-art global burden of disease (GBD) studies only provided comparative estimates of deaths attributable to ambient PM₂.₅ and ozone, without considering the potential contribution of NO₂ [46]. Using the counterfactual analytic framework adopted by GBD, we attributed 1.65 million deaths to NO₂ exposure in China for the year 2018 alone, equivalent to 112% of the GBD 2019 estimate (1.47 million) due to ambient PM₂.₅ and ozone for the Chinese population [46]. NO₂-mortality cohort studies focusing on C-R analyses are thus warranted across the world, particularly in developing countries, to facilitate more comprehensive GBD estimation of the disease burden attributable to ambient air pollution.
Our study observed an elevated death risk associated with NO₂ among adult men only, showing a significant effect modification by sex. This finding could be partially explained by biological differences between the sexes and by differences in the intensity of work-related exposure to outdoor air pollution. Stronger NO₂-mortality associations were also estimated for men in two European cohorts [47], while DUELS [45] reported significantly greater vulnerability among women when assessing deaths from respiratory disease and lung cancer. We found a tendency for higher NO₂-related HRs in younger age groups (16-59 years), which was in agreement with the evidence of effect modification by age highlighted in two large cohorts (over 1 million participants) in Rome [47] and the Netherlands [45] for non-accidental deaths. In our analyses stratified by behavioral factors, only ever/current smokers and alcohol drinkers were at greater death risk associated with NO₂ exposure, although no clear evidence of effect modification was identified in our study population. In addition, we estimated highly comparable NO₂-associated hazards between those with and without regular physical activity. The combined effects of air pollution and physical activity on mortality have attracted great research interest but are not yet fully understood [48]. In several existing large cohorts of US women [48] (n = 104,990), middle-aged Danish adults [49] (n = 52,061), and the Hong Kong elderly [50] (n = 66,820), no significant interactions between NO₂/PM₂.₅ exposure and physical activity were reported in associations with total and cardiovascular mortality.
This study has several strengths. First, this cohort investigation provided the first nationwide epidemiologic evidence for a long-term association between NO₂ exposure and adult mortality in mainland China. Second, exposure assessment in our analysis was based on high-resolution NO₂ prediction models that take advantage of satellite-retrieved estimates and machine learning methods. This methodological advance could largely reduce exposure errors compared with the majority of prior cohort studies, which relied on measurements from fixed-site monitors or on estimates derived from geospatial statistical methods and chemistry transport models [10]. Additionally, our analyses took into account a rich set of confounders, including individual lifestyles, and provided robust NO₂-mortality evidence by including high-resolution PM₂.₅/O₃ estimates for additional adjustment.
Some limitations of our analyses should also be noted. First, participants' NO₂ exposures were assessed at the county level rather than assigned based on residential addresses, which may have resulted in some unavoidable exposure misclassification. Second, high-quality NO₂ estimates at finer spatial-temporal scales are still widely lacking globally and regionally [51], which has hampered comparative analyses using various exposure datasets. Third, owing to data unavailability, we could not account for the residential mobility of study participants in CFPS follow-up surveys during 2012-2018. Finally, cause-specific analyses of NO₂-mortality associations were not performed because of a relatively high proportion of indeterminable causes of death in the CFPS database. This limitation may complicate direct comparison between our estimates of NO₂-attributable deaths, which are based on all-cause deaths, and assessments obtained by summing cause-specific contributions [38].

Conclusions
In summary, this study associated an elevated risk of adult mortality with long-term exposure to ambient NO₂, using an 8-year nationwide cohort in mainland China. Our dose-response analysis indicated an approximately linear NO₂-mortality curve, which provides a valuable opportunity to quantify the mortality burden attributable to NO₂ exposure in the Chinese general population. Future population-based cohorts should take full advantage of high-quality exposure datasets to enhance the understanding of NO₂-induced health risks and to promote the comprehensive assessment of regional and global disease burden due to ambient NO₂.