An Exposome-Wide Association Study On Body Mass Index In Adolescents Using The National Health And Nutrition Examination Survey (NHANES) 2003-2004 And 2013-2014 Data



Background: Excess weight is a public health challenge affecting millions worldwide, including younger age groups. The human exposome concept presents a novel opportunity to comprehensively characterize all non-genetic disease determinants at susceptible time windows.

Objective: Our study aimed to describe the association between multiple lifestyle and nutritional exposures and body mass index (BMI) in adolescents using the exposome framework.

Methods: We conducted an exposome-wide association study using the U.S. National Health and Nutrition Examination Survey (NHANES) 2003-2004 survey for discovery of associations between the study population characteristics and BMI, and the 2013-2014 survey for replication. We included non-diabetic and non-pregnant adolescents aged 12-18 years. We analyzed variables that were available in both survey rounds and had <20% missing values, in relation to BMI z-scores. We performed univariable and multivariable linear regression analysis adjusted for age, sex, race/ethnicity, household smoking, and income to poverty ratio, and corrected for false-discovery rate (FDR).

Results: A total of 1899 and 1224 participants were eligible from the 2003-2004 and 2013-2014 survey waves, respectively. The weighted proportions of overweight were 18.4% and 18.5%, whereas those of obesity were 18.1% and 20.6% in 2003-2004 and 2013-2014, respectively. Retained exposure agents included 63 dietary, 57 clinical and 1 physical activity variables. After FDR correction, univariable regression identified 35 and 18 predictors in the discovery and replication datasets, respectively, while multivariable regression identified 20 and 9 predictors in the discovery and replication datasets, respectively. Seven were significant in both datasets: alanine aminotransferase, gamma glutamyl transferase, mean cell volume, segmented neutrophils number, triglycerides, uric acid and white blood cell count.

Discussion: This is the first ExWAS in NHANES describing associations between zBMI and nutritional and clinical factors in adolescents. Future studies are warranted to investigate the role of the identified predictors as early-stage biomarkers of increased BMI and associated pathologies among adolescents and to replicate the findings in other populations.


Overweight and obesity are described as excessive adiposity or fat accumulation, often eliciting multiple health conditions that lead to health impairments or even death (Ofei 2005; Djalalinia et al. 2015; WHO 2021). Trends of overweight and obesity have been increasing over the past three decades, globally and across all age groups (Ng et al. 2014). In a systematic analysis of global studies, the prevalence of overweight and obesity was estimated to have risen by 27.5% for adults and 47.1% for children (2–19 years) between 1980 and 2013, leading to an estimated increase in the number of overweight and obese individuals from 857 million to 2.1 billion during the same period (Ng et al. 2014). In 2016, at least 340 million children and adolescents (5–19 years) were overweight or obese as per the World Health Organization (WHO) estimates (Obesity and overweight). In the United States alone, prevalence estimates of overweight and obesity among children and adolescents (2–19 years) also increased across race/ethnicities from 1966–1967 through 2017–2018, reaching 16% and 19%, respectively, or approximately 14.4 million in total (Fryar et al. 2020).

The physiological complications of overweight and obesity in children, adolescents and adults are well documented in the literature (Hruby et al. 2016; Lee and Yoon 2018). Adult onset of type 2 diabetes, hypertension, cardiovascular diseases, pulmonary, endocrine and gastrointestinal disorders as well as premature death were identified as major consequences of childhood and adult obesity (Lee and Yoon 2018; Reilly and Kelly 2011).

Obesity in adolescents tends to persist into adulthood. In a 12-year prospective study in Slovenia, more than half of 7-year-old overweight and obese children remained overweight or obese at the age of 18 (Starc and Strel 2011). In a longitudinal follow-up of individuals enrolled in the 1996 through 2007–2009 surveys of the US National Longitudinal Study of Adolescent Health, obese adolescents were 16 times more likely to develop severe obesity in adulthood than normal-weight or overweight adolescents (The et al. 2010). Very high adolescent BMI (≥ 85th percentile) was associated with a 30–40% higher adult mortality rate, as shown in a longitudinal assessment of national health surveys conducted in Norway between 1963 and 1999 (Engeland et al. 2004).

Overweight and obesity are multifactorial, with numerous identified risk factors. Genetic susceptibility is considered one of these risk factors, yet monogenic or polygenic forms of obesity are rare, and complex interactions between genetic and environmental factors are biologically plausible (Clément and Ferré 2003; Lee and Yoon 2018; Speiser et al. 2005). While genetic factors are measured in genome-wide association studies (GWAS), exposome studies are needed to capture all environmental factors complementary to the genome that influence health (Wild 2005). In the case of obesity and overweight, dietary behaviors (e.g., increased fat and calorie intake), lifestyle preferences (e.g., low physical activity, increased screen time), and psychological states (e.g., emotional stress and maladaptive coping strategies) are all considered possible risk factors of obesity onset and projection, and their comprehensive examination is warranted for developing effective disease prevention programs (Lee and Yoon 2018).

Similar to GWAS, exposome-wide association studies (ExWAS) are an agnostic, untargeted and hypothesis-generating approach aiming to identify associations between environmental factors and disease outcomes (Gangler et al. 2019; Haddad et al. 2019; Patel et al. 2010; Zheng et al. 2020). The ExWAS approach was first used to explore the association of numerous nutritional and environmental factors with various health outcomes in adults, such as abdominal obesity (Wulaningsih et al. 2017), blood pressure (McGinnis et al. 2016), diabetes mellitus (Patel et al. 2010) and telomere length (Patel et al. 2017). The application of ExWAS in studying excess weight in younger age groups is still limited to a single study investigating the association between polycyclic aromatic hydrocarbons and childhood obesity among participants aged 6–17 years using data from the 1999–2016 National Health and Nutrition Examination Survey (NHANES) surveys (Uche et al. 2020).

Given the public health significance of the rising trends of overweight and obesity in younger age groups worldwide, comprehensive, data-driven and hypothesis-generating approaches are warranted to elucidate the environmental origins of obesity. Using data from the NHANES datasets in 2003–2004 and 2013–2014, this study aimed to explore the association between multiple lifestyle and nutritional exposures and changes in body mass index (BMI) of U.S. adolescents aged 12–18 years, using the exposome framework.


Study population

We used the National Health and Nutrition Examination Survey (NHANES) data collected in the 2003–2004 and the 2013–2014 survey waves. All methods were performed in accordance with the relevant guidelines and regulations. We included participants who were between 12 and 18 years old at the time of the interview, who reported not being pregnant and not having been told by a doctor or health professional that they had diabetes, and who had a non-missing BMI value. Two surveys ten years apart were selected to allow the description of exposome profile variations among adolescents in the United States.

The NHANES is a program of studies conducted every two years by the U.S. Centers for Disease Control and Prevention (CDC) in a nationally representative multistage probability sample of the non-institutionalized population living in the United States (Centers for Disease Control and Prevention (CDC) 2020). Each of the cross-sectional datasets includes demographic, dietary and health-related questions collected during the interview, in addition to laboratory tests and medical, dental and physiological measurements collected during the physical examination at the mobile examination centers (Centers for Disease Control and Prevention (CDC) 2020).

Study outcome

The study outcome was the age- and sex-standardized body mass index (BMI) z-score, calculated using standard deviation scores from the U.S. CDC 2000 growth chart (Ogden and Flegal 2010). To describe weight status categories, we used the 2010 CDC terminology for childhood overweight and obesity. The BMI z-score percentiles were used to classify children as being “underweight” (< 5th percentile); “healthy weight” (5th percentile ≤ BMI z-score < 85th percentile); “overweight” (85th percentile ≤ BMI z-score ≤ 95th percentile); and “obese” (> 95th percentile) (Centers for Disease Control and Prevention (CDC) 2019).
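The percentile cut-offs above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, the z-score is converted to a percentile with the standard normal distribution, and assigning a value lying exactly on the 5th percentile to "healthy weight" is an assumption.

```python
from statistics import NormalDist


def bmi_percentile(z_bmi: float) -> float:
    """Convert a BMI z-score to a percentile of the standard normal reference."""
    return 100.0 * NormalDist().cdf(z_bmi)


def weight_status(z_bmi: float) -> str:
    """Map a BMI z-score to the weight status categories described in the text."""
    p = bmi_percentile(z_bmi)
    if p < 5:
        return "underweight"
    if p < 85:  # assumption: exactly the 5th percentile counts as healthy weight
        return "healthy weight"
    if p <= 95:
        return "overweight"
    return "obese"
```

For example, `weight_status(0.0)` falls at the 50th percentile and is classified as healthy weight.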

Selection of explanatory variables

We screened the data files from each of the five NHANES sections (i.e., demographics, dietary, examination, laboratory and questionnaire data) of the 2003–2004 and 2013–2014 surveys. In the initial screening, we selected variables that were: i) available in both surveys with similar responses; ii) targeting adolescents (12–18 years old); and iii) relevant to the objective of the study (i.e., excluding language spoken at home, audiometry examinations, dermatology, etc.). As a result, we retained 121 variables belonging to the following NHANES sections: dietary (63), laboratory data (57), and questionnaire (1) (Fig. 1).

The 63 dietary variables referred to those collecting information about the dietary intake consumed during the 24-hour period prior to the interview. Laboratory variables comprised the following measurements: serum and urinary albumin, serum cotinine, standard biochemistry, complete blood count and glycohemoglobin (n = 45); urinary measurements of phthalates (n = 10) and urinary environmental phenols (n = 2). The physical activity variable was retained from the questionnaire section.

Next, variables with more than 20% of missing values in each of the 2003–2004 and 2013–2014 datasets were dropped from the analysis. Since the phthalate and urinary environmental phenol measurements had at least 67% of missing values in each of the 2003–2004 and 2013–2014 datasets, they were dropped from the rest of the analysis.
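A minimal sketch of such a missingness filter is given below. It is not the authors' script: the function names and toy values are hypothetical, missing values are represented by `None`, and the filter assumes that a variable is retained only when its missing fraction is at most 20% in both waves.

```python
from typing import Dict, List, Optional

Column = List[Optional[float]]


def missing_fraction(values: Column) -> float:
    """Share of missing (None) entries in one variable."""
    if not values:
        return 1.0
    return sum(v is None for v in values) / len(values)


def retained_variables(
    discovery: Dict[str, Column],
    replication: Dict[str, Column],
    threshold: float = 0.20,
) -> List[str]:
    """Keep variables present in both waves whose missingness is within the
    threshold in each wave (assumed reading of the 20% rule)."""
    shared = sorted(set(discovery) & set(replication))
    return [
        name
        for name in shared
        if missing_fraction(discovery[name]) <= threshold
        and missing_fraction(replication[name]) <= threshold
    ]


# Hypothetical toy data, for illustration only.
wave_a = {"uric_acid": [300.0, 280.0, None, 310.0, 295.0],
          "phthalate_x": [None, None, None, 1.2, None]}
wave_b = {"uric_acid": [290.0, 305.0, 299.0, 310.0, 280.0],
          "phthalate_x": [None, None, 0.8, None, None]}
kept = retained_variables(wave_a, wave_b)
```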

Variables transformation

All variables were evaluated as either continuous or categorical. For dietary and laboratory continuous variables, zero values were substituted with 1e−10; the variables were then log-transformed, scaled and centered. When necessary, categorical variables were recoded for a harmonized coding between the two survey waves. For physical activity, a positive answer to vigorous physical activity “over past 30 days” (in the 2003–2004 dataset) and “during a typical week” (in the 2013–2014 dataset) were treated similarly. The participant’s educational level was categorized according to the NHANES coding available in the 2003–2004 demographic questionnaire: “Less than high school”, “High school including GED” and “More than high school”. We kept the grouping of race/ethnicity as included in the NHANES questionnaire: Mexican American, Non-Hispanic White, Non-Hispanic Black, other Hispanic and other ethnicities. Age and poverty income ratio (PIR), the ratio of family income to the poverty threshold, were treated as continuous variables. Household smoking was coded as “Yes/No”; an answer of 1 or more smoking household members was recoded as “Yes” in the 2013–2014 dataset to match the variable assessment in the 2003–2004 dataset.
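The transformation of continuous variables can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' implementation: the function name is hypothetical, the natural logarithm is assumed (the base is not stated in the text), scaling uses the sample standard deviation, and missing values are assumed to have been removed beforehand.

```python
import math
from statistics import mean, stdev
from typing import List


def log_scale(values: List[float], zero_substitute: float = 1e-10) -> List[float]:
    """Replace zeros, take the natural log (assumed base), then center and scale."""
    logged = [math.log(v if v != 0 else zero_substitute) for v in values]
    mu = mean(logged)
    sd = stdev(logged)  # sample standard deviation (assumption)
    return [(x - mu) / sd for x in logged]


# Hypothetical intake values, including a zero.
transformed = log_scale([0.0, 1.0, 10.0, 100.0])
```

Because the substituted value is very small, a zero ends up far below the other observations on the log scale, which is worth keeping in mind when interpreting the scaled variables.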

Statistical analysis

To account for the complex survey design, non-response and post-stratification, survey-weighted analysis was conducted using the appropriate sample weight as per the NHANES analytical guidelines (CDC 2021). For each survey wave, we created survey design datasets comprising the variables retained from each section along with their assigned sample weight: the demographic variables with the interview weight (wtint2yr), the laboratory and physical activity variables with the Mobile Examination Center (MEC) exam weight (wtmec2yr), and the dietary variables with the dietary day 1 sample weight (wtdrd1). For correlations between all explanatory variables, we applied the least common denominator approach and hence used the weights of the dietary variables.
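To illustrate what a sample weight does to a point estimate, the sketch below computes a weighted mean (or, for a 0/1 indicator, a weighted proportion). It is a simplified illustration with hypothetical names and toy values; it does not reproduce the survey-design variance estimation, which also requires the strata and sampling-unit information handled by dedicated survey software.

```python
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted point estimate: sum(w * x) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical example: 1 = obese, 0 = not obese, with toy sample weights.
obese = [1, 0, 0, 1]
sample_weight = [10.0, 30.0, 40.0, 20.0]
weighted_share = weighted_mean(obese, sample_weight)  # (10 + 20) / 100
```

In this toy example the unweighted share would be 50%, whereas the weighted share is 30%, because the two obese participants represent fewer people in the target population.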

Descriptive statistics and correlations between all explanatory variables were calculated separately for the discovery and replication datasets. Survey-weighted regressions were conducted to explore the associations between each of the explanatory variables and BMI z-score (univariable) and after adjusting for age, sex, race/ethnicity, household smoking and income to poverty ratio (multivariable). We applied the Benjamini-Hochberg method (Storey and Tibshirani 2003) to generate false discovery rate (FDR) adjusted p-values separately for the univariable and multivariable analyses per wave. Predictors with FDR adjusted p-value < 0.05 were considered significant. Next, we identified predictors that were commonly significant in the multivariable analysis of both the discovery and replication datasets.
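The FDR adjustment described above can be sketched as follows. This is a plain-Python illustration of the Benjamini-Hochberg step-up adjustment, not the authors' script; the function name and the toy p-values are hypothetical.

```python
from typing import List


def bh_adjust(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four regressions.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = bh_adjust(raw)
significant = [p < 0.05 for p in fdr]
```

Each raw p-value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw p-values increase.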

To evaluate the possible interaction between sex, the statistically significant predictors retained in the previous step and zBMI, we repeated the multivariable analysis by adding the interaction term, separately for the discovery and replication datasets.
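One way to write such an interaction model is shown below. The notation is an assumption introduced here for illustration: $x_i$ is the predictor of interest for participant $i$, $\mathbf{c}_i$ collects the remaining adjustment covariates (age, race/ethnicity, household smoking and income to poverty ratio), and $\beta_3$ is the interaction term being tested.

```latex
\mathrm{zBMI}_i = \beta_0 + \beta_1 x_i + \beta_2\,\mathrm{sex}_i
  + \beta_3\,(x_i \times \mathrm{sex}_i)
  + \boldsymbol{\gamma}^{\top}\mathbf{c}_i + \varepsilon_i
```

Under this formulation, a non-zero $\beta_3$ indicates that the slope of zBMI on the predictor differs between the sexes.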

Additionally, a sensitivity analysis was conducted by repeating the univariable and multivariable regressions in non-obese adolescents from each of the 2003–2004 and 2013–2014 datasets. All analyses were conducted in R version 4.0.4 (R-project 2021) using RStudio version 1.4.1103 (RStudio). All survey-weighted descriptive and explanatory analyses were conducted using the R survey package. The scripts used in the analysis as well as the detailed output of the analysis are available in the supplementary information.


Descriptive characteristics

Following the selection criteria, a total of 1899 and 1224 participants were eligible from the 2003–2004 and 2013–2014 survey waves, respectively (Fig. 2). The weighted mean age of adolescents was similar across the two surveys, with a mean of 15 (standard error (SE) 0.1) years of age. Similarly, in both surveys almost half of our study population was male, and most participants were non-Hispanic white (64.5% and 54.2% in the 2003–2004 and 2013–2014 survey waves, respectively) and had less than a high school education (90.5% and 91.8% in the 2003–2004 and 2013–2014 survey waves, respectively). The mean poverty income ratio (PIR) was similar between the two surveys, with a weighted mean of 2.6 (0.1 SE) and 2.4 (0.1 SE) in 2003–2004 and 2013–2014, respectively. The weighted proportion of reported household smoking was 25.1% (95%CI: 19.9%, 30.4%) in the 2003–2004 survey wave and 23.6% (95%CI: 17.5%, 29.6%) in the 2013–2014 survey wave.

The weighted proportions of BMI categories were similar across the two surveys: “underweight” (2% in both survey waves); “healthy weight” (61.4% and 58.5% in 2003–2004 and 2013–2014, respectively); “overweight” (18% in both survey waves), and “obese” (18.1% and 20.6% in 2003–2004 and 2013–2014, respectively) (Table 1). No statistically significant (p > 0.05) differences were noted in the distribution of BMI categories between sexes in both survey waves (Supplementary information S1).

Table 1

Estimated (weighted) background characteristics of the study population from the 2003–2004 and 2013–2014 NHANES surveys

| Characteristic | 2003–2004 | 2013–2014 |
|---|---|---|
| Age (mean, (standard error)) | 15 (0.1) | 15 (0.1) |
| Sex (% [95%CI]) | | |
| | 50.7 (48.3%, 53.2%) | 51.8 (47.9%, 55.7%) |
| | 49.3 (46.8%, 51.7%) | 48.2 (44.3%, 52.1%) |
| Race/Ethnicity (% [95%CI]) | | |
| Mexican American | 11 (5.4%, 16.5%) | 15.4 (8.9%, 22%) |
| Non-Hispanic black | 14.7 (9.7%, 19.8%) | 14.3 (9.6%, 19%) |
| Non-Hispanic white | 64.5 (54.8%, 74.2%) | 54.2 (43.3%, 65%) |
| Other Hispanic | 4.9 (2.3%, 7.4%) | 6.9 (4.4%, 9.4%) |
| Other ethnicity | 4.9 (2.9%, 6.8%) | 9.2 (6.6%, 11.8%) |
| Education level | | |
| Less than high school | 90.5 (86.9%, 94.1%) | 91.8 (89.3%, 94.3%) |
| More than high school | 3.6 (1.9%, 5.2%) | 1.6 (0.7%, 2.5%) |
| High school diploma including GED | 5.9 (2.8%, 9%) | 6.6 (4.4%, 8.8%) |
| Income to Poverty Ratio (mean, (standard error)) | 2.6 (0.1) | 2.4 (0.1) |
| Household smokers (% [95%CI]) | | |
| Yes | 25.1 (19.9%, 30.4%) | 23.6 (17.5%, 29.6%) |
| No | 74.9 (69.6%, 80.1%) | 76.4 (70.4%, 82.5%) |
| Body Mass Index (% [95%CI]) | | |
| Underweight | 2.2 (1.1%, 3.2%) | 2.4 (1.1%, 3.8%) |
| Healthy weight | 61.4 (55.7%, 67.1%) | 58.5 (53.8%, 63.2%) |
| Overweight | 18.4 (15.3%, 21.5%) | 18.5 (15.5%, 21.5%) |
| Obese | 18.1 (14.1%, 22.1%) | 20.6 (16.2%, 24.9%) |

Exploratory ExWAS analysis using the 2003–2004 and 2013–2014 dataset

The univariable zBMI regression models for each of the 109 variables in the 2003–2004 dataset resulted in 67 significant predictors, of which only 35 remained significant after FDR correction: 21 laboratory variables (alanine aminotransferase (ALT); albumin; bicarbonate; total bilirubin; cholesterol; gamma glutamyl transferase (GGT); serum glucose; serum iron; lactate dehydrogenase; lymphocyte number; mean cell hemoglobin; mean cell volume; monocyte percent; platelet count; red blood cell count; segmented neutrophils number; sodium; total calcium; triglycerides; serum uric acid; and white blood cell count) and 14 dietary variables (carbohydrates; dietary fiber; folate as dietary folate equivalents; iron; lutein and zeaxanthin; magnesium; niacin; retinol; riboflavin (vitamin B2); thiamin (vitamin B1); total folate; total sugars; vitamin B6; and zinc). All significant predictors (FDR adjusted p-value < 0.05) were negatively associated with zBMI, except for the following: ALT; cholesterol; GGT; serum glucose; lactate dehydrogenase; lymphocyte number; platelet count; red blood cell count; segmented neutrophils number; triglycerides; serum uric acid; and white blood cell count (Supplementary information S2).

Multivariable regression resulted in 44 significant predictors, of which only 20 remained significant after FDR correction: 14 laboratory variables (ALT; albumin; bicarbonate; GGT; lymphocyte number; mean cell hemoglobin; mean cell volume; phosphorus; platelet count; red blood cell count; segmented neutrophils number; triglycerides; serum uric acid; and white blood cell count) and 6 dietary variables (folate as dietary folate equivalents; iron; riboflavin (vitamin B2); total sugars; vitamin B6; and zinc) (Table 2). Nine of these significant predictors were positively associated with BMI z-score (ALT; GGT; lymphocyte number; platelet count; red blood cell count; segmented neutrophils number; triglycerides; uric acid; and white blood cell count).

Table 2

Variables significant in multivariable linear regressions adjusted for age, sex, race/ethnicity, household smoking and income to poverty ratio in 2003–2004 and 2013–2014 survey datasets

| Variable | 2003–2004 discovery dataset: Estimate (S.E.) | 2003–2004 discovery dataset: FDR adjusted p-value | 2013–2014 replication dataset: Estimate (S.E.) | 2013–2014 replication dataset: FDR adjusted p-value |
|---|---|---|---|---|
| Alanine aminotransferase (ALT, U/L) | 0.383 (0.044) | < 0.001 | 0.405 (0.039) | < 0.001 |
| Gamma glutamyl transferase (GGT, U/L) | 0.341 (0.043) | < 0.001 | 0.416 (0.034) | < 0.001 |
| Mean cell volume (fL) | -0.174 (0.035) | | -0.208 (0.046) | |
| Segmented neutrophils number (1000 cells/uL) | 0.211 (0.047) | | 0.273 (0.057) | |
| Triglycerides (mmol/L) | 0.285 (0.038) | < 0.001 | 0.355 (0.032) | |
| Uric acid (umol/L) | 0.452 (0.046) | < 0.001 | 0.494 (0.066) | < 0.001 |
| White blood cell count (1000 cells/uL) | 0.188 (0.036) | | 0.269 (0.054) | |
| Albumin (g/L) | -0.263 (0.061) | | | |
| Bicarbonate (mmol/L) | -0.181 (0.047) | | | |
| Lymphocyte number | 0.102 (0.027) | | | |
| Mean cell hemoglobin (pg) | -0.163 (0.034) | | | |
| Phosphorus (mmol/L) | -0.193 (0.042) | | | |
| Platelet count SI (1000 cells/uL) | 0.17 (0.034) | | | |
| Red blood cell count (million cells/uL) | 0.207 (0.038) | | | |
| Lactate dehydrogenase (U/L) | | | 0.191 (0.035) | |
| Monocyte number (1000 cells/uL) | | | 0.215 (0.034) | |
| Nutrition (day1) | | | | |
| Folate, DFE (mcg) | -0.16 (0.042) | | | |
| Iron (mg) | -0.146 (0.037) | | | |
| Riboflavin (Vitamin B2) (mg) | -0.17 (0.033) | | | |
| Total sugars (gm) | -0.153 (0.037) | | | |
| Vitamin B6 (mg) | -0.129 (0.033) | | | |
| Zinc (mg) | -0.131 (0.034) | | | |

For the replication dataset (2013–2014), the univariable regression models resulted in 31 significant predictors of which only 18 remained significant after correction for the false discovery rate: 17 laboratory variables (ALT; albumin; GGT; globulin; serum iron; lactate dehydrogenase; lymphocyte number; mean cell hemoglobin; mean cell volume; monocyte number; platelet count; segmented neutrophils number; red cell distribution width; total bilirubin; triglycerides; serum uric acid; white blood cell count) and 1 dietary variable (total sugars) (Supplementary information S2). Multivariable regression resulted in 32 significant predictors of which only 9 laboratory variables remained significant after FDR correction: ALT; GGT; lactate dehydrogenase; mean cell volume; monocyte number; segmented neutrophils number; triglycerides; serum uric acid and white blood cell count (Table 2). Only the mean cell volume had an inverse significant association with zBMI. Volcano plots showing the distribution of the model estimates per predictor used in the ExWAS analysis are available in Figs. 3 and 4.

Seven variables were significant after FDR correction in the multivariable analysis of both the discovery and replication datasets: ALT; GGT; mean cell volume; segmented neutrophils number; triglycerides; serum uric acid; and white blood cell count. In the second multivariable model, which accounts for the interaction between sex and each of the aforementioned predictors, the interaction term was significant for serum uric acid (estimate = 0.245, p = 0.022) and ALT (estimate = 0.248, p = 0.035) in the discovery dataset, and only for ALT (estimate = 0.185, p = 0.03) in the replication dataset (Figs. 5, 6). The full model estimates for variables with significant interaction terms are available in the supplementary information. Details on the regression analysis results can be found in the Supplementary Information S2.
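The overlap between the two waves can be reproduced as a simple set intersection. The sketch below uses the predictor names reported in the text for the multivariable analyses; the variable names and the short labels are illustrative choices, not the NHANES variable codes.

```python
# Predictors significant after FDR correction in the multivariable analysis,
# as listed in the text (labels shortened for illustration).
discovery_2003_2004 = {
    "ALT", "albumin", "bicarbonate", "GGT", "lymphocyte number",
    "mean cell hemoglobin", "mean cell volume", "phosphorus",
    "platelet count", "red blood cell count",
    "segmented neutrophils number", "triglycerides", "serum uric acid",
    "white blood cell count", "folate (DFE)", "dietary iron",
    "riboflavin", "total sugars", "vitamin B6", "zinc",
}
replication_2013_2014 = {
    "ALT", "GGT", "lactate dehydrogenase", "mean cell volume",
    "monocyte number", "segmented neutrophils number", "triglycerides",
    "serum uric acid", "white blood cell count",
}

# Predictors significant in both waves.
replicated = sorted(discovery_2003_2004 & replication_2013_2014)
```

The intersection contains the seven laboratory variables named above, and none of the dietary variables.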

Sensitivity analysis

Sensitivity analysis was conducted on a sample of 1540 and 966 non-obese adolescents from the 2003–2004 and 2013–2014 datasets, respectively. None of the predictors were significant in the univariable or multivariable analysis of the 2003–2004 survey. For the 2013–2014 survey, 6 laboratory variables (ALT; albumin; blood urea nitrogen; GGT; triglycerides; and white blood cell count) and 3 dietary variables (cholesterol; folic acid; and vitamin B12) were significant in the univariable analysis after FDR correction; however, none remained significant after adjusting for covariates in the multivariable analysis.


2003–2004 dataset: When examining the correlations between dietary and laboratory variables, weak coefficients were found overall; a maximum coefficient of r = 0.217 was noted for the correlation between selenium and blood urea nitrogen.

Looking specifically at the correlations between serum uric acid and the laboratory and dietary predictors, the correlations of uric acid with serum creatinine and hemoglobin showed the highest coefficients (0.47 and 0.44, respectively), followed by a coefficient of 0.33 for the correlation between uric acid and GGT. The correlation between GGT and ALT had a coefficient of 0.52. A strong correlation was found between mean cell hemoglobin and mean cell volume (r = 0.94). Medium correlations were found for albumin and total calcium (r = 0.56), albumin and total protein (r = 0.55), total bilirubin and iron (r = 0.42), and triglycerides and total cholesterol (r = 0.35), while the correlation between GGT and platelet count was weak (r = 0.13) (Supplementary information S2).

2013–2014 dataset: Similarly, weak coefficients were found when examining the correlation between dietary and laboratory variables, as noted in the correlation between selenium and each of protein (r = 0.28); phosphorus (r = 0.27) and blood urea nitrogen (r = 0.27). Medium correlations were noted for: GGT and ALT (r = 0.54); uric acid and each of creatinine (r = 0.45), hemoglobin (r = 0.39) and hematocrit (r = 0.39) (Supplementary information S2).


This is the first ExWAS in NHANES describing associations between zBMI and nutritional and clinical factors in adolescents. We conducted an exposome-wide association study exploring 121 explanatory variables with respect to (sex-specific) BMI-for-age in non-diabetic and non-pregnant adolescents aged 12–18 years in the 2003–2004 NHANES survey and used the 2013–2014 survey for validation. After adjusting for age, sex, race/ethnicity, household smoking, and income to poverty ratio, we found ALT, GGT, mean cell volume, segmented neutrophils number, triglycerides, serum uric acid and white blood cell count to have statistically significant associations with zBMI after FDR correction in both the discovery and replication datasets, in line with literature findings for our study age group.

Uric acid is the end product of purine metabolism in humans, and its concentration depends on dietary purines (from animal proteins, meat, seafood, beer and fructose sources), the degradation of endogenous purines, and the renal and intestinal excretion of urate (Gustafsson and Unwin 2013). Although serum uric acid levels increase differently by sex from birth until adolescence, there is still no universally accepted threshold for defining hyperuricemia, or excess concentrations of serum uric acid, in children and adolescents (Kubota 2019). Elevated levels of uric acid have been associated with obesity and non-communicable diseases, such as kidney and cardiovascular diseases, in children and adolescents (Bussler et al. 2017; Kubota 2019). Recent literature emphasized the association between uric acid and metabolic syndrome (MetS) outcomes in children and adolescents, such as glucose intolerance, central obesity, hypertension, and dyslipidemia (Bussler et al. 2017; Goodman 2020; Kong et al. 2013). Uric acid was associated with the prevalence of metabolic syndrome and its components, as shown in a cross-sectional analysis of 1370 adolescents (12–17 years of age) using data from NHANES 1999–2002; the unweighted prevalence of metabolic syndrome was ≈ 21% in the highest quartile (> 339 µmol/L) as compared to ≈ 10% in the third quartile (≤ 339 µmol/L), ≈ 4% in the second quartile (≤ 291 µmol/L) and < 1% among participants in the unweighted lowest quartile of serum concentrations of uric acid (≤ 250 µmol/L) (Ford et al. 2007). A similar distribution of serum uric acid levels was observed in this study of non-diabetic adolescents for the 2003–2004 dataset (median: 297 µmol/L, interquartile range (IQR): [250 µmol/L, 351 µmol/L]) and the 2013–2014 dataset (median: 297.4 µmol/L, IQR: [244 µmol/L, 351 µmol/L]).

In the adjusted multivariable analysis, the strongest effect size among the exposomic variables associated with zBMI was observed for uric acid in both the discovery (estimate = 0.452) and replication (estimate = 0.707) analyses. After excluding obese participants, the association between zBMI and uric acid was no longer statistically significant, although it still showed the strongest effect size in both the discovery and replication subsets (adjusted estimates of 0.24 and 0.39, respectively). Our findings are thus consistent with a pathogenic role of elevated concentrations of uric acid in young obese age groups, as reported in different studies. In a case-control study conducted in Italy among 120 children and adolescents with primary obesity (zBMI ≥ 97th percentile) and 50 healthy controls, carotid intima-media thickness was significantly correlated (r = 0.61; 95% CI, 0.58–0.64) with the fourth quartile of uric acid among obese children regardless of the presence of metabolic syndrome, defined in the study as three or more of the following criteria: obesity, hypertension, low HDL cholesterol, elevated triglycerides, and impaired fasting glucose and/or insulin resistance (Pacifico et al. 2008). On the other hand, the association between serum uric acid and cardiovascular diseases, irrespective of BMI, has also been documented in a study conducted on a 1999–2006 unweighted NHANES sample of adolescents aged 12–17 years; after adjusting for age, sex, race/ethnicity and BMI, the odds ratio of having elevated blood pressure (mean systolic and/or diastolic blood pressure percentile ≥ 95th percentile) was 1.38 (95% CI, 1.16 to 1.65) for each 0.1 mg/dL increase in uric acid (Loeffler et al. 2012).

In a randomized, double-blinded trial among pre-hypertensive obese adolescents (11–17 years old), patients treated with two mechanisms of urate reduction (allopurinol and probenecid) did not continue to gain weight during the 3-month study period and showed a similar and significant reduction in their systolic blood pressure by 10.2 mm Hg and in their diastolic blood pressure by 9.0 mm Hg in the two treatment groups as compared to the placebo group, thus highlighting the role of uric acid as a biochemical mediator of increased blood pressure (Soletsky and Feig 2012).

The observed association of both uric acid and GGT with obesity and other cardiovascular risk factors has been previously documented in the literature. In a cross-sectional study of 2067 children and adolescents (6–20 years) in Hong Kong, a combined effect of the upper quartiles of both uric acid and GGT was observed in association with obesity, low high-density lipoprotein cholesterol (HDL-C) level and high blood pressure (adjusted odds ratios ranged from 1.63 to 5.82, all p < 0.005) (Kong et al. 2013). GGT, a liver enzyme implicated in the degradation of glutathione, is associated with BMI, total cholesterol and diabetes mellitus (all components of MetS), as well as with cardiovascular disease and all-cause mortality in adults (Mason et al. 2010). The correlation of GGT with MetS and hypertension among younger age groups was demonstrated in a 10-year longitudinal study in Taiwan, where subjects (10–15 years) with higher baseline levels of GGT were at least twice as likely to develop MetS and hypertension during the follow-up period (Lin et al. 2017). In our analysis, we showed a positive, albeit weak, correlation between uric acid and GGT in both the discovery and replication datasets, without adjustment for zBMI.

ALT also had a statistically significant association with zBMI in the multivariable analysis of both the discovery and replication datasets. ALT is a liver enzyme related to liver fat accumulation and is considered a useful biomarker for non-alcoholic fatty liver disease (NAFLD) (Liu et al. 2014). The association between serum ALT and zBMI was previously documented in a study among adolescents (12–18 years) from NHANES III (1988–1994), in which overweight and obese study subjects were three to six times more likely to have higher levels of ALT (> 30 U/L) as compared to those with normal weight (Strauss et al. 2000). Similarly, a significant correlation between ALT, zBMI and metabolic syndrome was found among 5411 adolescents aged 12–19 years from NHANES 1999–2014, yet with no significant increase in the prevalence of elevated ALT over time (Fermin et al. 2017). Elevated serum ALT levels (> 40 U/L) were also associated with markers of metabolic syndrome, as demonstrated in a study among adolescents 10–19 years old from the Korean National Health and Nutrition Examination Survey 1998 (Park et al. 2005).

Elevated triglycerides or hypertriglyceridemia is common among obese children and adolescents, and this component of metabolic syndrome is a known biomarker of cardiovascular disease risk (Jung and Yoo 2018). In our analysis, a positive association between triglycerides and zBMI was found in each of the discovery (estimate = 0.285) and replication dataset (estimate = 0.444), but not in the sensitivity analysis in non-obese adolescents. This positive association found in our ExWAS study between triglycerides and zBMI is in line with the findings of a study on abnormal lipid levels among adolescents (12–19 years) in NHANES 1999–2006, in which 22% of overweight and 43% of obese had at least one abnormal lipid level including elevated triglycerides, the most common lipid abnormality associated with excess weight (Centers for Disease Control and Prevention (CDC) 2010).

The inverse association between mean cell volume and zBMI found here was previously documented in a study of 210 female adolescents (12–17 years) using NHANES 2003–2004. Lower mean cell volume and transferrin saturation, together with higher serum transferrin receptor, were found among overweight and obese female participants, indicative of iron deficiency. These findings, consistent with those among obese adults, suggest that the obesity-associated anemia reported in adults and children also occurs in female adolescents (Tussing-Humphreys et al. 2009).

Our analysis also showed associations between zBMI and inflammatory markers, namely white blood cell count and segmented neutrophils number. Similar findings were previously documented among adolescents (Hsieh et al. 2007; Reyes et al. 2015; Wu et al. 2010), suggesting that obesity-induced inflammation could start in early childhood (Singer and Lumeng 2017).

None of the dietary variables remained significant in the multivariable analysis in either the discovery or the replication dataset. Additionally, only weak correlations were found between the laboratory and dietary variables in both datasets.
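The discover-then-replicate filter behind statements such as the one above can be illustrated with a minimal Python sketch. It assumes a Benjamini-Hochberg adjustment as a stand-in for the FDR procedure (the study's exact procedure may differ), and every p-value and variable name below is invented for illustration only; none are study results.

```python
def bh_adjust(pvals):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def replicated(discovery, replication, alpha=0.05):
    """Names with an FDR-adjusted p-value below alpha in both datasets."""
    def significant(results):
        names = list(results)
        adj = bh_adjust([results[n] for n in names])
        return {n for n, q in zip(names, adj) if q < alpha}

    return sorted(significant(discovery) & significant(replication))


# Hypothetical per-variable p-values (assumptions, not study output).
discovery = {"triglycerides": 0.0004, "uric_acid": 0.001,
             "dietary_fiber": 0.04, "vitamin_c": 0.60}
replication = {"triglycerides": 0.002, "uric_acid": 0.010,
               "dietary_fiber": 0.30, "vitamin_c": 0.75}

hits = replicated(discovery, replication)
print(hits)
```

In this toy input, the hypothetical dietary variable is nominally significant in the discovery set (p = 0.04) but no longer passes the 0.05 threshold once adjusted for multiple testing, which is the kind of attrition the filter is designed to produce.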

The associations of uric acid, GGT and ALT with zBMI found in the whole study population were no longer statistically significant in the sensitivity analysis restricted to non-obese adolescents. This observation warrants further investigation of the potential use of these biochemical parameters as biomarkers in the early stages of obesogenesis in adolescence or childhood. The sex-specific trends observed in the association between these three biochemical parameters and zBMI are also worthy of detailed investigation in other population studies, as they might be useful in future obesity screening and prevention programs in adolescence and/or earlier life stages.

The strength of this study lies in the agnostic nature of the ExWAS approach, which allows for the simultaneous assessment of multiple parameters and their associations with different outcomes. The NHANES dataset is considered representative of the US population; indeed, the weighted estimates of overweight and obesity prevalence in this US study population (12–18 years old) were similar in the 2003–2004 (18.4% and 18.1%, respectively) and 2013–2014 survey datasets (18.5% and 20.6%, respectively). Moreover, the obesity prevalence estimates in both datasets were similar to the weighted estimates by the U.S. CDC of 17.4% (13.9%–21.3%) for the 2003–2004 survey and 20.6% (16.2%–25.6%) for the 2013–2014 survey (Ogden et al. 2016). The models are considered robust, having been tested on two NHANES datasets collected ten years apart. Another strong feature of this ExWAS study was the inclusion of variables belonging to all exposome domains: the general external (individual household income), the specific external (dietary variables, education, household smoking and physical activity) and the internal domain (intrinsic and laboratory variables). Yet, in order to fully explore the exposome’s utility, it is encouraged to include additional groups of environmental components in relation to the studied health outcome (Haddad et al. 2019).
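To make concrete what a weighted prevalence estimate means in this context, the following minimal Python sketch computes a prevalence with and without sampling weights. The flags and weights are invented for illustration and are not NHANES values; a real NHANES analysis would also account for strata and primary sampling units when estimating variances, which is not shown here.

```python
def weighted_prevalence(flags, weights):
    """Share of the total sampling weight carried by flagged participants."""
    if len(flags) != len(weights):
        raise ValueError("flags and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w for f, w in zip(flags, weights) if f) / total


# Hypothetical participants: True marks obesity; each weight is the
# number of people in the population that the participant represents.
flags = [True, False, False, True, False]
weights = [1000.0, 3000.0, 2500.0, 500.0, 3000.0]

unweighted = sum(flags) / len(flags)
weighted = weighted_prevalence(flags, weights)
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

In this toy example the two flagged participants carry small weights, so the weighted estimate is lower than the raw sample proportion; this is why survey-weighted figures, rather than raw counts, are compared with the CDC estimates.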

Due to the cross-sectional study design of NHANES, causal associations cannot be established. Although the approach was as inclusive as possible, not all exposome parameters were available or could be included; for example, chemical exposure data were not fully available in these NHANES surveys. In addition, dietary assessment using self-reported food frequency might be subject to recall bias, underreporting, overreporting, or omission of foods (Raatz et al. 2017). Furthermore, because of day-to-day variation or within-person variability in dietary data, multiple measures of daily intake are recommended to ensure sufficient reliability (Institute of Medicine (US) Committee on Dietary Risk Assessment in the WIC Program, 2005). Another possible limitation is the use of only two of the available NHANES survey cycles; ExWAS studies integrating additional NHANES datasets, as well as a larger number of environmental exposure variables, are warranted to improve our knowledge of the environmental determinants of obesogenesis.

Adolescence is a critical life window of susceptibility to metabolic conditions such as overweight and obesity, during which ongoing development may be perturbed by a suite of environmental stressors, including lifestyle/behavioral factors and dietary habits (Schneider et al. 2017). The methodological framework of the human exposome and its tools allow for a comprehensive assessment of multiple factors with respect to disease outcomes through an agnostic, untargeted and hypothesis-generating approach. The NHANES-based discovered and replicated predictors of zBMI among U.S. adolescents seem to be in line with the global literature, which further highlights their importance as potential early-stage biomarkers of excess weight. Additional studies in younger age groups are warranted to better elucidate the implication of these biomarkers in metabolic disease pathogenesis.


Ethics approval and consent to participate

The NCHS Ethics Review Board approved the survey protocols and informed consent was obtained for all subjects.

Consent for publication

This manuscript contains no individual-level data for which consent for publication would need to be obtained.

Availability of data and materials

The R code, scripts and output are publicly available and have been submitted through the journal’s portal. Supplementary visualization material based on these datasets is also provided and can be easily customized to accommodate other NHANES surveys.

Conflict of Interest Form

All authors declare no conflict of interest for the manuscript entitled An exposome-wide association study on body mass index in adolescents using the National Health and Nutrition Examination Survey (NHANES) 2003-2004 and 2013-2014 data.


Makris K.C. acknowledges the partial funding support by the EXPOSOGAS project, H2020 research and innovation programme under grant agreement #810995.

Author contributions

Nadine Haddad, Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review & editing

Xanthi D. Andrianou, Data curation, Formal analysis, Methodology, Writing – review & editing

Christa Parrish, Data curation, Methodology, Writing – review & editing

Stavros Oikonomou, Data curation, Methodology, Writing – review & editing

Konstantinos C. Makris, Conceptualization, Data curation, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – review & editing


  1. Bussler, S. et al. Novel Insights in the Metabolic Syndrome in Childhood and Adolescence. Horm Res Paediatr, 88, 181–193 (2017).
  2. Centers for Disease Control and Prevention (CDC). 2019. BMI for Age Training Course. Recommended BMI-for-age Cutoffs. Available: [accessed 10 April 2021].
  3. Centers for Disease Control and Prevention (CDC). 2020. Introduction. National Health and Nutrition Examination Survey. Available: [accessed 10 April 2021].
  4. Centers for Disease Control and Prevention (CDC). 2021. Module 3 - Weighting. NHANES. Available: [accessed 10 April 2021].
  5. Centers for Disease Control and Prevention (CDC). Prevalence of abnormal lipid levels among youths --- United States, 1999–2006. MMWR Morb Mortal Wkly Rep, 59, 29–33 (2010).
  6. Clément, K. & Ferré, P. Genetics and the pathophysiology of obesity. Pediatr Res, 53, 721–725 (2003).
  7. Djalalinia, S., Qorbani, M., Peykari, N. & Kelishadi, R. Health impacts of Obesity. Pakistani Journal of Medical Sciences, 31, (2015).
  8. Engeland, A., Bjørge, T., Tverdal, A. & Søgaard, A. J. Obesity in Adolescence and Adulthood and the Risk of Adult Mortality. Epidemiology, 15, 79–85 (2004).
  9. Fermin, C. R., Lee, A. M., Filipp, S. L., Gurka, M. J. & DeBoer, M. D. Serum Alanine Aminotransferase Trends and Their Relationship with Obesity and Metabolic Syndrome in United States Adolescents, 1999–2014. Metabolic Syndrome and Related Disorders, 15, 276–282 (2017).
  10. Ford, E. S., Li, C., Cook, S. & Choi, H. K. Serum Concentrations of Uric Acid and the Metabolic Syndrome Among US Children and Adolescents. Circulation, 115, 2526–2532 (2007).
  11. Fryar, C. D. et al. 2020. Prevalence of Overweight, Obesity, and Severe Obesity Among Children and Adolescents Aged 2–19 Years: United States, 1963–1965 Through 2017–2018.
  12. Gangler, S. et al. 2019. Exposure to disinfection byproducts and risk of type 2 diabetes: a nested case–control study in the HUNT and Lifelines cohorts. Metabolomics 15.
  13. Goodman, A. 2020. Requiem in the Time of Pandemic. Medical Research Archives 8.
  14. Gustafsson, D. & Unwin, R. The pathophysiology of hyperuricaemia and its possible relationship to cardiovascular disease, morbidity and mortality. BMC Nephrol, 14, 164 (2013).
  15. Haddad, N., Andrianou, X. D. & Makris, K. C. A Scoping Review on the Characteristics of Human Exposome Studies. Current Pollution Reports, (2019).
  16. Hruby, A. et al. Determinants and Consequences of Obesity. American Journal of Public Health, 106, 1656–1662 (2016).
  17. Hsieh, C-H. et al. Correlation between white blood cell count and metabolic syndrome in adolescence. Pediatr. Int, 49, 827–832 (2007).
  18. Institute of Medicine (US) Committee on Dietary Risk Assessment in the WIC Program. Dietary Risk Assessment in the WIC Program (National Academies Press (US), Washington (DC), 2005).
  19. Jung, M. K. & Yoo, E-G. Hypertriglyceridemia in Obese Children and Adolescents. J Obes Metab Syndr, 27, 143–149 (2018).
  20. Kong, A. P. S. et al. Associations of uric acid and gamma-glutamyltransferase (GGT) with obesity and components of metabolic syndrome in children and adolescents. Pediatric Obesity, 8, 351–357 (2013).
  21. Kubota, M. 2019. Hyperuricemia in Children and Adolescents: Present Knowledge and Future Directions. J Nutr Metab 2019; doi:10.1155/2019/3480718.
  22. Lee, E. Y. & Yoon, K-H. Epidemic obesity in children and adolescents: risk factors and prevention. Frontiers of medicine, 12, 9 (2018).
  23. Lin, C-M. et al. Predictive Value of Serum Gamma-glutamyltranspeptidase for Future Cardiometabolic Dysregulation in Adolescents- a 10-year longitudinal study. Sci Rep, 7, (2017).
  24. Liu, Z., Que, S., Xu, J. & Peng, T. Alanine Aminotransferase-Old Biomarker and New Concept: A Review. Int J Med Sci, 11, 925–935 (2014).
  25. Loeffler, L. F., Navas-Acien, A., Brady, T. M., Miller, E. R. & Fadrowski, J. J. Uric Acid Level and Elevated Blood Pressure in U.S. Adolescents. Hypertension, 59, 811–817 (2012).
  26. Mason, J. E., Starke, R. D. & Kirk, J. E. V. Gamma-Glutamyl Transferase: A Novel Cardiovascular Risk BioMarker. Prev. Cardiol, 13, 36–41 (2010).
  27. McGinnis, D. P., Brownstein, J. S. & Patel, C. J. Environment-Wide Association Study of Blood Pressure in the National Health and Nutrition Examination Survey (1999–2012). Sci. Rep, 6, (2016).
  28. Ng, M., Fleming, T. & Robinson, M. Global, regional, and national prevalence of overweight and obesity in children and adults during 1980–2013: A systematic analysis for the Global Burden of Disease Study 2013 (Lancet (2014) 384 (766–781)). The Lancet, 384, 746 (2014).
  29. Obesity and overweight. Available: [accessed 11 July 2021].
  30. Ofei, F. 2005. Obesity - A Preventable Disease. Ghana Medical Journal 39.
  31. Ogden, C. L. et al. Trends in Obesity Prevalence Among Children and Adolescents in the United States, 1988–1994 Through 2013–2014. JAMA, 315, 2292–2299 (2016).
  32. Ogden, C. L. & Flegal, K. M. 2010. Changes in Terminology for Childhood Overweight and Obesity. 6.
  33. World Health Organization. Obesity and Overweight. Available: [accessed 3 January 2020].
  34. Pacifico, L. et al. Serum uric acid and its association with metabolic syndrome and carotid atherosclerosis in obese children. European journal of endocrinology / European Federation of Endocrine Societies, 160, 45–52 (2008).
  35. Park, H. S., Han, J. H., Choi, K. M. & Kim, S. M. Relation between elevated serum alanine aminotransferase and metabolic syndrome in Korean adolescents. The American Journal of Clinical Nutrition, 82, 1046–1051 (2005).
  36. Patel, C. J., Bhattacharya, J. & Butte, A. J. An Environment-Wide Association Study (EWAS) on type 2 diabetes mellitus. PLoS One, 5, e10746 (2010).
  37. Patel, C. J., Manrai, A. K., Corona, E. & Kohane, I. S. Systematic correlation of environmental exposure and physiological and self-reported behaviour factors with leukocyte telomere length. Int J Epidemiol, 46, 44–56 (2017).
  38. Raatz, S. K., Conrad, Z., Johnson, L. K., Picklo, M. K. & Jahns, L. Relationship of the Reported Intakes of Fat and Fatty Acids to Body Weight in US Adults. Nutrients, 9, (2017).
  39. Reilly, J. J. & Kelly, J. Long-term impact of overweight and obesity in childhood and adolescence on morbidity and premature mortality in adulthood: systematic review. International Journal of Obesity, 35, 891–898 (2011).
  40. Reyes, M. et al. Obesity is associated with acute inflammation in a sample of adolescents. Pediatr. Diabetes, 16, 109–116 (2015).
  41. R-project. 2021. R: The R Project for Statistical Computing.
  42. RStudio. Open source & professional software for data science teams - RStudio. Available: [accessed 2 July 2021].
  43. Schneider, B. C., Dumith, S. C., Orlandi, S. P. & Formoso Assuncao, M. C. Diet and body fat in adolescence and early adulthood: a systematic review of longitudinal studies. Ciencia & Saude Coletiva, 22, 1539–1552 (2017).
  44. Singer, K. & Lumeng, C. N. The initiation of metabolic inflammation in childhood obesity. J Clin Invest, 127, 65–73 (2017).
  45. Soletsky, B. & Feig, D. I. Uric Acid Reduction Rectifies Prehypertension in Obese Adolescents. Hypertension, 60, 1148–1156 (2012).
  46. Speiser, P. W. et al. Childhood Obesity. The Journal of Clinical Endocrinology & Metabolism, 90, 1871–1887 (2005).
  47. Starc, G. & Strel, J. Tracking excess weight and obesity from childhood to young adulthood: a 12-year prospective cohort study in Slovenia. Public Health. Nutr, 14, 49–55 (2011).
  48. Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A, 100, 9440–9445 (2003).
  49. Strauss, R. S., Barlow, S. E. & Dietz, W. H. Prevalence of abnormal serum aminotransferase values in overweight and obese adolescents. J Pediatr, 136, 727–733 (2000).
  50. The, N. S., Suchindran, C., North, K. E., Popkin, B. M. & Gordon-Larsen, P. Association of Adolescent Obesity With Risk of Severe Obesity in Adulthood. JAMA, 304, 2042–2047 (2010).
  51. Tussing-Humphreys, L. M., Liang, H., Nemeth, E., Freels, S. & Braunschweig, C. A. Excess Adiposity, Inflammation, and Iron-Deficiency in Female Adolescents. Journal of the American Dietetic Association, 109, 297–302 (2009).
  52. Uche, U. I., Suzuki, S., Fulda, K. G. & Zhou, Z. Environment-wide association study on childhood obesity in the U.S. Environ. Res, 191, 110109 (2020).
  53. Wild, C. P. Complementing the Genome with an “Exposome”: The Outstanding Challenge of Environmental Exposure Measurement in Molecular Epidemiology. Cancer Epidemiology Biomarkers & Prevention, 14, 1847–1850 (2005).
  54. Wu, C-Z. et al. Relationship between white blood cell count and components of metabolic syndrome among young adolescents. Acta Diabetol, 47, 65–71 (2010).
  55. Wulaningsih, W. et al. Investigating nutrition and lifestyle factors as determinants of abdominal obesity: an environment-wide study. Int J Obes (Lond), 41, 340–347 (2017).
  56. Zheng, Y. et al. Design and Methodology Challenges of Environment-Wide Association Studies: A Systematic Review. Environ Res, 183, 109275 (2020).