We applied six clocks, including one we developed, to HUVEC-derived DNA methylation data collected from a racially and ethnically diverse sample of newborns. Only our clock was highly correlated with clinical gestational age; Horvath’s clock, a cornerstone of epigenetic aging research, in contrast, was not significantly associated with clinical gestational age in newborns (Fig. 1). Because of our limited sample size, we could not first build an epigenetic gestational age clock from an independent dataset and then validate it on our data. The correlation coefficient for our clock shown in Fig. 1 is therefore likely overestimated. Knight’s clock was built on a dataset of 207 newborns; its correlation was 0.99 in the training dataset and 0.91 (8% lower) in the testing dataset [6]. The EPIC GA clock was built on a dataset of 755 newborns; its correlation was 0.84 in the training set and 0.71 (15% lower) in the testing set, which came from another country, and this might explain the large difference [8]. Our clock was trained on a dataset of 336 newborns, with a correlation of 0.85 in the training set. Even allowing for the high diversity of our data, if the correlation were 30% lower in a testing set (double the drop seen for the EPIC GA clock), we would still obtain a correlation of about 0.6, which is still much higher than the correlations we observed for the other clocks.
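The arithmetic behind this comparison can be checked with a short sketch. It uses only the correlations quoted above; the function names are illustrative, and the 30% drop is the hypothetical value assumed in the text, not a measured result.

```python
def relative_drop(train_r, test_r):
    """Fractional decrease in correlation from the training to the testing set."""
    return (train_r - test_r) / train_r


def projected_test_r(train_r, drop):
    """Correlation expected in a testing set if it falls by the fraction `drop`."""
    return train_r * (1 - drop)


# Correlations reported for the published clocks [6, 8].
knight_drop = relative_drop(0.99, 0.91)  # roughly 8%
epic_drop = relative_drop(0.84, 0.71)    # roughly 15%

# Hypothetical scenario for our clock: a 30% drop, about double the EPIC GA drop.
projected = projected_test_r(0.85, 0.30)

print(f"Knight drop: {knight_drop:.1%}")
print(f"EPIC GA drop: {epic_drop:.1%}")
print(f"Projected testing correlation for our clock: {projected:.3f}")
```

The projected value is an extrapolation under the stated assumption; it does not replace validation in an independent dataset.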
One reason clocks may perform poorly in newborns is that they are often trained on data from children (PedBE) or adults (Horvath’s clock). However, Knight’s clock, Bohlin’s clock, and the EPIC GA clock were each developed using newborn training datasets; nonetheless, epigenetic age estimated from these three clocks still demonstrated much lower correlations with clinical gestational age in our dataset than in the original investigators’ own testing datasets [6–8]. There are at least two possible reasons for this. First, the three existing neonatal clocks were built with methylation data derived from cord blood or blood spots, whereas DNA samples in our study were derived from primary HUVECs using a validated methodology. Epigenetic clocks trained using data from one cell type may have poorer performance in other cell types [4, 15, 16]. Second, the racial/ethnic composition of the training datasets may also play a role. Only 17% of training samples were from Black participants for Knight’s clock, and both Bohlin’s clock and the EPIC GA clock were based on The Norwegian Mother and Child Cohort Study (MoBa), which is composed primarily of White participants. Our sample was more racially and ethnically heterogeneous: 35.4% identified as non-Hispanic White, 38.7% identified as non-Hispanic Black, and 24.7% identified as Hispanic (Table 1). Social experiences that covary with racial/ethnic identity may be associated with methylation profiles and estimates of epigenetic age [2, 17, 18]. In our own data, we saw methylation differences between infants born to mothers who identified as Black and those born to mothers who did not, suggesting the possibility that racialized experiences may underlie the observed associations (Supplementary Fig. 2).
In contrast to the Bohlin, Knight and EPIC GA clocks, the PedBE clock was developed from samples collected from individuals 0 to 20 years old. Nonetheless, the correlation between PedBE and our clock was similar to the correlation between the other three newborn clocks and our clock. This might be explained by differences in tissue type. PedBE was constructed from buccal cells, which, like the HUVECs used to generate our data, are a type of epithelial cell.
Knight’s clock, Bohlin’s clock, and the EPIC GA clock were constructed from populations of majority European ancestry; therefore, they may be expected to have poorer performance in more racially and ethnically diverse populations. Although the racial/ethnic composition of the training dataset for the PedBE clock is not published, the correlations we observed suggest that its training data may have come from a more diverse population than those of the Knight, Bohlin, and EPIC GA clocks (Supplementary Table 2).
We also found that EGAA was associated with birthweight and certain maternal demographic characteristics. Specifically, the mean increase in EGAA was 0.11 (95% CI: 0.04–0.19) weeks per kg of birthweight. This estimate is similar to that reported by Bright et al. [19]. However, we did not see significant associations with maternal social adversity in this study. Social adversity scores varied by cohort and race/ethnicity (Supplementary Fig. 3); thus, we may need a larger cohort to detect the association in a stratified analysis.
Consistent with prior studies, estimates of EGAA from our clock varied by racial/ethnic identity. Horvath proposed intrinsic epigenetic age acceleration (IEAA), which “measures ‘pure’ epigenetic aging effects that are not confounded by differences in blood cell counts” [2]. HUVECs are pure endothelial cells, so EGAA estimated from HUVECs should be similar to IEAA, without confounding by differences in the counts of contributing cell types. Indeed, prior work has found that Hispanic adults have lower intrinsic epigenetic aging than White adults. This is consistent with our findings in newborns (Fig. 3E). We also found that newborns identified as Black had lower EGAA, whereas Horvath found that individuals with African ancestry did not have lower IEAA. Newborns identified as Black have been found to have lower birth weights even at the same clinical gestational age [20]. In our data, birth weights of newborns identified as Black were significantly lower than those of newborns identified as Hispanic or White (Supplementary Fig. 4). Lower EGAA among newborns identified as Black might be related to lower birth weights, since we observed that lower birth weight was associated with lower EGAA (Fig. 3D). These racial/ethnic differences, which may reflect differences in social experiences, including racialized experiences, were also found in our subcohorts, one of which was predominantly non-Hispanic Black, while the other was predominantly White (Table 1, Fig. 3A). Mothers who identified as Hispanic were more likely to have a pregnancy complication (54.2%) than those who identified as non-Hispanic (28.9%) in our data. Newborns who were transferred to the NICU for additional care after birth were also more likely to have lower birth weights (Supplementary Fig. 5). These six significant variables were highly correlated. To control for the confounding effects of these variables, we fit a multivariable linear regression of EGAA on the significant variables found in the unadjusted analysis.
Four of the six variables were significantly associated with EGAA (all except presence of a pregnancy complication and non-Hispanic Black racial/ethnic identity) (Supplementary Table 4).
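To illustrate why adjustment matters when predictors are correlated, the following minimal sketch fits an ordinary least-squares model with and without a second covariate. Everything in it is an assumption made for illustration: the data are synthetic toy values (not study data), the two covariates stand in for birth weight and NICU admission, the coefficients are arbitrary, and the helper functions are hypothetical rather than the software used for the analysis.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic toy data: (birth weight in kg, NICU admission 0/1). Not study data.
rows = [(2.4, 1), (2.8, 1), (3.0, 0), (3.3, 0),
        (3.6, 0), (2.6, 1), (3.9, 0), (3.1, 1)]
# Outcome generated without noise from arbitrary illustrative coefficients.
y = [-0.3 + 0.1 * bw - 0.2 * nicu for bw, nicu in rows]

adjusted = ols(rows, y)                           # intercept, weight, NICU
unadjusted = ols([(bw,) for bw, _ in rows], y)    # intercept, weight only

print("adjusted weight slope:  ", round(adjusted[1], 3))
print("unadjusted weight slope:", round(unadjusted[1], 3))
```

In this toy example the adjusted fit recovers the generating coefficients, while the weight-only fit absorbs part of the NICU effect into the weight slope because the two covariates are correlated. This is the kind of distortion a multivariable model is meant to reduce.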
Some variables that capture pregnancy and neonatal health risks, such as NICU admission and pregnancy complications, were associated with lower EGAA. Complications, NICU admission, and low birth weight have previously been associated with higher risk of developmental delay [21–23], while high birth weight has been associated with earlier onset of puberty [24]. In contrast to the assumption in adults that a lower epigenetic age equates to better health, epigenetic age in pediatric samples may be ideal when it is concordant with chronological age: neither fast nor slow epigenetic aging is likely to be beneficial during early development. Thus, epigenetic gestational age could provide insight into the developmental stage of newborns.
The results of this study should be considered in light of several limitations. First, the population in this study was heterogeneous, including two cohorts and multiple racial/ethnic groups. Although this enhances external validity, it will be important to validate our findings in other samples. Second, our sample size was small, especially for preterm newborns, so we may have been underpowered to detect a relationship between EGAA and social adversity. Third, while we addressed confounding analytically, residual and unmeasured confounding are still potential issues. Fourth, while our cell type, HUVEC, was novel and extends prior literature, we did not examine the utility of the clock in other cell populations. Finally, we did not examine clinical or behavioral outcomes beyond the neonatal period, which limits insight into whether EGAA is adaptive or non-adaptive.