DNA Methylation and Asthma Acquisition During and Post-Adolescence, an Epigenome-Wide Longitudinal Study.

Background- While the majority of asthma starts in early childhood, asthma onset in some individuals occurs during adolescence or in adulthood. However, the pathogenesis of later onset asthma as well as the observed sex specificity are not well understood. Objective- We hypothesized that DNAm at specific CpG sites measured before disease onset, either in pre- or post-adolescence, would be associated with asthma acquisition both during adolescence and in later adulthood. Methods- Subjects from the Isle of Wight Birth Cohort (IOWBC) were included. DNAm in blood at ages 10 (pre-adolescence) and 18 (post-adolescence), and asthma acquisition from ages 10-18 and 18-26 years, were studied. To improve statistical power, we first screened epigenome-wide CpGs based on the association of DNAm at 10 years with asthma acquisition from 10-18 years. Logistic regressions with repeated measures were then applied to the CpGs that survived screening to examine the associations of pre-adolescence DNAm with asthma acquisition from pre- to post-adolescence, and of post-adolescence DNAm with asthma acquisition from post-adolescence to adulthood. The effect of DNAm on asthma acquisition at different transition periods was evaluated using interaction terms. The ALSPAC birth cohort was used for independent replication. For biological assessment of the identified CpGs, pathway enrichment analysis was conducted and differentially methylated regions were assessed. Results- Significant interaction effects of DNAm and transition period (10-18 or 18-26 years) on asthma acquisition were found for 17 CpGs in males and 98 CpGs in females (FDR=0.05) in IOWBC. Consistent interaction effects were observed for 9 CpGs in males and 53 CpGs in females in ALSPAC. For CpGs not showing interaction effects (i.e., the effect of DNAm is stable over time), association with asthma acquisition was found for 38 CpGs in males and 52 CpGs in females in IOWBC.
Of these 90 CpGs, consistent directions of association were observed in ALSPAC at 13 CpGs in males and 37 CpGs in females. Genes to which the identified CpGs were mapped, e.g., AKAP1 and ENO1, have been shown to be associated with asthma. Conclusion- DNAm is associated with asthma acquisition, and such association is likely to be sex and transition period specific.


Introduction
Asthma is the most prevalent chronic respiratory condition 1 , affecting 1-18% of the population in several countries 2 . Over recent decades, childhood asthma has become a major public health issue 3 with an increasing prevalence worldwide 4 . Environmental factors such as air pollution, infectious agents, and tobacco smoke have been shown to be associated with the development of asthma 5 .
DNA methylation (DNAm), a robust and stable epigenetic mark, represents a potential mechanism for environmental impact on human diseases 6 . Recent studies suggest that DNAm signatures of cytosine-phosphate-guanine (CpG) sites are associated with asthma [7][8][9] . Since peripheral blood is readily obtainable and easy to handle in laboratory processing, and information on immune cells in blood is relevant to asthma pathogenesis 10 , DNAm in peripheral blood cells has been commonly examined in epigenome-wide studies of asthma [11][12][13][14] .
While the development of asthma clearly reflects the combination of inherited susceptibility and environmental exposures, the pathogenesis and underlying biological mechanisms involved in the onset of asthma later in life are not well understood. Asthma most commonly develops during early childhood 15 , and the prevalence of asthma depends on gender and age. Asthma is more prevalent among pre-adolescent boys, while it becomes more prevalent among females after puberty, with prevalence in males and females being approximately equal in adulthood [16][17][18] . However, the pathogenesis of these sex differences in asthma across adolescence and adulthood remains unclear.
Although previous studies have demonstrated an association between DNAm and asthma, the role that DNAm plays in asthma acquisition, especially during the critical transition period from pre- to post-adolescence, and how its role changes over time, e.g., from adolescence to adulthood, are unknown. Findings from this type of study will not only identify important markers for asthma acquisition but also, more importantly, benefit future efforts in asthma prediction and, consequently, asthma prevention. To this end, for each gender, we examined the association between DNAm at pre-adolescence and asthma acquisition from pre- to post-adolescence (10 to 18 years), and between DNAm at post-adolescence and asthma acquisition from post-adolescence to adulthood (18-26 years), utilizing genome-wide DNA methylation data. We hypothesized that DNAm at specific CpG sites measured before disease onset, either in pre- or post-adolescence, would be associated with asthma acquisition both during adolescence and in later adulthood, and that there would be differences in such DNA methylation patterns by time window (adolescence or post-adolescence) and by gender.

Study population
The study population comprised children born between January 1, 1989 and February 28, 1990

Asthma acquisition
Questionnaires that included the questions of the International Study of Asthma and Allergies in Childhood (ISAAC) were completed by parents/participants at ages 4, 10, 18 and 26 years 8,21-23 . Asthma was defined as "physician diagnosed asthma" and "wheezing or whistling in the chest in the last 12 months" or "current treatment for asthma." Subjects with asthma at age four years were excluded. The outcome used in this study, asthma acquisition, was defined as being asthma free at age 10 years and recorded as having asthma at age 18 years (no→yes). The same definition was applied for asthma acquisition from 18 to 26 years (no→yes). Subjects who did not have asthma at either end of a transition period were taken as the reference (no→no).
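The outcome coding described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function and argument names are hypothetical, and the grouping of the asthma definition as diagnosis AND (wheeze OR treatment) is an assumption, since the wording in the text leaves the precedence of "and"/"or" open.

```python
def has_asthma(physician_diagnosed, wheeze_last_12_months, current_treatment):
    """Asthma status at one follow-up.

    Assumed grouping (not stated explicitly in the text):
    physician diagnosis AND (recent wheeze OR current treatment).
    """
    return physician_diagnosed and (wheeze_last_12_months or current_treatment)


def acquisition_status(asthma_at_start, asthma_at_end):
    """Classify one subject for one transition period (e.g. 10-18 years).

    Returns "acquired" for no->yes, "reference" for no->no, and None for
    subjects who already had asthma at the start of the period, because
    they belong to neither comparison group.
    """
    if asthma_at_start:
        return None
    return "acquired" if asthma_at_end else "reference"
```

Applying `acquisition_status` once per transition period gives each subject up to two outcome records, which is the repeated-measures structure used in the later modeling.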

DNA methylation
Using a standardized salting procedure, DNA was extracted from peripheral whole blood samples collected at ages 10 and 18 years 24 . Fluorometric quantitation was used to estimate DNA concentration. Methylation levels at each CpG site were measured using Illumina Infinium HumanMethylation450 or MethylationEPIC BeadChips (Illumina, Inc., San Diego, CA, USA). Probes that did not reach a detection p-value of 10^-16 in at least 95% of samples were excluded. The same criterion was applied to exclude samples, i.e., samples with p-value > 10^-16 in at least 95% of the CpGs. CpGs on sex chromosomes were excluded.
Using the CPACOR pipeline, DNA methylation (DNAm) was pre-processed for the data from both HumanMethylation450 and MethylationEPIC. DNAm intensities were quantile normalized using the minfi R package 25 . The quantile normalized intensities at autosomal probes were then converted to beta values. Principal components (PCs) inferred from control probes were used to represent latent chip-to-chip and technical variation. We determined PCs based on DNAm at shared control probes of the two DNAm platforms, HumanMethylation450 and MethylationEPIC. In total, 195 shared control probes were used to calculate the control probe PCs, with the top 15 PCs included in our study to represent latent batch factors 26 . In this study, CpG sites common to the Illumina 450k and EPIC platforms were examined. In addition, CpG sites were excluded if the minor allele frequency of a probe SNP at that site was > 0.7% (i.e., ~ ≥ 10 out of 1456 subjects expected to have the minor allele in the cohort) and the probe SNP was within 10 base pairs of the targeted CpG site. After quality control and pre-processing, 442,475 CpG sites were included in subsequent analysis.
Since whole blood is a mixture of distinct cell types 27 , there is a need to adjust for cell type composition to account for its potentially confounding effects 28 . Cell type proportions were estimated using the Bioconductor minfi package 29,30 . The estimated proportions of CD4+ T cells, natural killer cells, neutrophils, B cells, monocytes, and eosinophils were included in the analyses as confounders.
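The detection p-value filter can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline actually used: the function names are hypothetical, a p-value at or below the threshold is treated as "reaching" it, the sample-level rule is assumed to mirror the probe-level rule, and both rules are evaluated on the full matrix (the text does not say whether samples were filtered before or after probes).

```python
def pass_fraction(pvalues, threshold=1e-16):
    """Fraction of detection p-values that reach the threshold (p <= threshold)."""
    return sum(p <= threshold for p in pvalues) / len(pvalues)


def filter_probes_and_samples(detp, threshold=1e-16, min_fraction=0.95):
    """Apply the same call-rate rule to probes (rows) and samples (columns).

    detp[i][j] is the detection p-value of probe i in sample j.
    Returns (kept_probe_indices, kept_sample_indices).
    """
    n_samples = len(detp[0])
    keep_probes = [
        i for i, row in enumerate(detp)
        if pass_fraction(row, threshold) >= min_fraction
    ]
    keep_samples = [
        j for j in range(n_samples)
        if pass_fraction([row[j] for row in detp], threshold) >= min_fraction
    ]
    return keep_probes, keep_samples
```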

Covariates
Atopic status was evaluated at ages 10 and 18 years based on results from skin prick tests (SPT) with 11 common allergens (house dust mite, cat dander, dog dander, grass pollen mix, tree pollen mix, Alternaria alternata, Cladosporium herbarum, cow's milk, hen's egg, peanut, and cod). Being SPT positive to one or more of the 11 allergens was treated as being atopic. Active smoking status at 18 and 26 years was recorded as 'yes' if the participant was a current smoker at the respective age. Second-hand smoke exposure was coded at ages 18 and 26 years using information on the smoking status of parents and other smokers in the household.
To evaluate the contribution of transition periods, 10-18 and 18-26 years, to the association of DNAm with asthma transition, transition periods were included in the analyses as adjusting factors.

Statistical analyses
By regressing the M-values (base-2 logit transformed beta values of DNAm) at each CpG site on the aforementioned 15 PCs and 6 cell type proportions, we obtained cell-type and batch-adjusted DNAm (residuals) at each of the 442,475 CpG sites for each gender in IOWBC. Screening of CpG sites was done to obtain DNAm potentially associated with asthma acquisition from pre-to post-adolescence using simple linear regressions.
Here, asthma acquisition from 10 to 18 years of age was the independent variable and DNAm at age 10 years was the dependent variable. The analysis was stratified by gender. For screening purposes, multiple testing was adjusted for by controlling the false discovery rate (FDR) at a liberal level of 0.2. CpG sites that passed screening were included in subsequent analyses.
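Two computational pieces of the screening step can be sketched in plain Python: the base-2 logit that turns beta values into M-values, and FDR control across CpGs. The sketch assumes the Benjamini-Hochberg procedure, which the text does not name explicitly; the function names and the clamping constant `eps` are illustrative choices, not part of the original analysis.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Base-2 logit of a methylation beta value (the M-value)."""
    b = min(max(beta, eps), 1.0 - eps)  # clamp away from 0 and 1 (eps is an assumption)
    return math.log2(b / (1.0 - b))


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def screen(pvalues, fdr=0.2):
    """Indices of CpGs whose adjusted p-value is at or below the FDR level."""
    return [i for i, q in enumerate(bh_adjust(pvalues)) if q <= fdr]
```

With the screening level of 0.2, `screen` keeps more CpGs than it would at 0.05; the stricter level is then applied to the models fitted on the retained CpGs.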
Logistic regressions with repeated measurements were applied to the CpGs that passed screening to evaluate the association of asthma acquisition (no→yes) at the two transition periods (10-18 years and 18-26 years) with DNAm at the earlier ages (10 and 18 years, respectively). The use of repeated measures increased the statistical power to detect the effects of interest, which is especially desirable when sample sizes are small. The models adjusted for atopic status at 10 and 18 years and for active and second-hand smoking status at 18 and 26 years. To assess whether the associations differed between transition periods, we tested interaction effects between DNAm and transition period in addition to the main effects of DNAm. For both situations (the models with main effects only and the models that included interaction effects), multiple testing was adjusted for by controlling the FDR at 0.05.
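The interaction model described above can be written schematically as follows. This is a sketch under assumptions, not the authors' exact specification: the symbols are introduced here for illustration, and the text does not state how the within-subject correlation between the two periods was handled (for example, generalized estimating equations or random effects), so only the mean model is shown.

```latex
\operatorname{logit}\,\Pr(Y_{it}=1)
  = \beta_0 + \beta_1 M_{it} + \beta_2 T_t
  + \beta_3\,(M_{it}\times T_t) + \boldsymbol{\gamma}^{\top}\mathbf{Z}_{it}
```

Here $Y_{it}=1$ if subject $i$ acquired asthma during transition period $t$, $M_{it}$ is the cell-type and batch-adjusted DNAm at the start of that period (age 10 or 18 years), $T_t$ is an indicator of the period (0 for 10-18 years, 1 for 18-26 years), and $\mathbf{Z}_{it}$ collects the atopy and smoking covariates. A nonzero $\beta_3$ corresponds to a period-specific association; the main-effects model omits the $\beta_3$ term, so that $\beta_1$ describes an association that is stable over time.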

Replication cohort -ALSPAC
In addition to the use of repeated measures in our statistical modeling, to further assess the informativeness of the findings in IOWBC, an independent replication cohort, the Avon Longitudinal Study of Parents and Children (ALSPAC) cohort [31][32][33] , was included to examine CpGs showing significant interaction effects with transition periods in IOWBC. DNAm in the ALSPAC cohort was assessed using the Infinium HumanMethylation450 BeadChip. DNAm data on 604 children in the ALSPAC cohort were available at ages 7 and 17 years 34 . DNAm pre-processing was performed by correcting for batch effects using the minfi package 25 and removing CpGs with detection p-value ≥ 0.01. Samples with a sex mismatch based on X-chromosome methylation were flagged. Estimated cell type proportions of CD4+ T cells, natural killer cells, neutrophils, B cells, monocytes, and granulocytes were used in the analyses to adjust for cell heterogeneity.
Asthma acquisition status from 7 to 17 years and from 17 to 22 years was included in the analysis. It was defined as having no asthma at age 7 years and having asthma at age 17 years. The same definition was applied for asthma acquisition from 17 to 22 years. Logistic regressions with repeated measurements were used with covariates available in ALSPAC that were similar to those in IOWBC, i.e., atopy status at age 7 years and second-hand smoke exposure status at ages 17 and 24 years. Please note that the study website contains details of all the data that are available through a fully searchable data dictionary and variable search tool (http://www.bristol.ac.uk/alspac/researchers/our-data/).

Detection of differentially methylated regions (DMR)
Differentially methylated regions (DMRs) were identified using the DMRcate package in R 35 . To secure a sufficient number of CpGs for DMR enrichment analysis and to avoid missing important DMRs, CpGs with DNAm (in M values) associated with asthma acquisition via logistic regression at FDR of 0.4 were included in the analysis.

Pathway enrichment analyses
For CpGs showing associations of DNAm with asthma acquisition status, the genes annotated to the CpGs were summarized along with information such as gene location and chromosome number, based on Illumina's manifest file and the UCSC genome browser (https://genome.ucsc.edu). Pathway enrichment analysis of the identified CpGs was conducted using the gometh function 36 in R to better understand their biological functionality.

Results
Since our study focused on asthma acquisition starting from age 10 years in IOWBC, subjects with asthma at four years were excluded. Participants in IOWBC with both asthma transition and DNAm data available at ages 10 and 18 years were included in the study. The subsamples represented the complete IOWBC (restricted to subjects without asthma at age 4 years) with respect to asthma acquisition, active and second-hand smoking, and atopy status (Tables 1a and 1b).
In total, 55 CpGs for males and 183 CpGs for females in IOWBC passed screening based on their potential associations with asthma acquisition from 10 to 18 years of age. These CpGs were included in subsequent analyses for their longitudinal associations of DNAm with asthma acquisition from pre- to post-adolescence and from post-adolescence to young adulthood and for interaction effects between DNAm and transition periods, using logistic regressions with repeated measurements. The utilization of logistic regressions with repeated measures allowed us to identify effect sizes with a higher statistical power even if the number of "cases" (Supplemental table 2a and 2b) between the two cohorts. The flowchart of the study along with brief summaries of results is in Figure 4.
Pathway enrichment analysis was conducted based on IOWBC-discovered CpGs for each sex (55 in males and 150 in females, 205 CpGs in total) to better understand their biological functionality. These CpGs were mapped to 54 and 149 genes in males and females, respectively. Using these CpGs in the gometh function in R, we identified 212 biological processes in males and 228 in females that were enriched at a p-value of 0.05.
Although none of the biological processes survived multiple testing at FDR of 0.05, genes involved in the top processes for each sex based on statistical significance (top 10 processes in Table 2) were potentially important and may deserve further assessment. For males, multiple biological processes among the top 10 focus on catabolic processes (breaking down glucose for energy), while for females they are biosynthetic processes (synthesizing glucose from food). Among these top processes, 13 genes corresponding to the identified CpGs were involved in males and 60 genes in females (Supplemental table 3). Of the 13 genes, CpGs on five (~38%) genes showed consistent associations (interaction or main effects) between IOWBC and ALSPAC, and of the 60 genes, 34 (~57%) genes showed such consistency between the two cohorts.
For DMR enrichment analysis, we used CpGs in the screening process that were statistically significant at FDR of 0.4 to cover epigenetic information on asthma acquisition comprehensively. In total, 427 CpGs in males and 372 CpGs in females were included in the analysis. We identified three DMRs in males and three DMRs in females (Table 3).

Discussion
We assessed the longitudinal association of DNAm measured at earlier ages with asthma acquisition at later ages for each sex based on data in two independent cohorts, with IOWBC as the discovery cohort and ALSPAC as the replication cohort. In the IOWBC, at 205 CpGs, pre-adolescence DNAm was shown to be associated with the odds of asthma acquisition from pre- to post-adolescence, and post-adolescence DNAm was associated with asthma acquisition from post-adolescence to adulthood. At 112 of these 205 CpGs (54.6%), consistent associations were observed in the ALSPAC cohort, including statistically significant findings at 7 CpGs. These 112 CpGs included 62 CpGs (9 in males) showing transition-specific associations with asthma acquisition, in that the association of DNAm with asthma acquisition at these 62 CpGs differed between the pre- to post-adolescence transition period and the post-adolescence to adulthood transition period.
Our findings also indicated significant differences between males and females. For the 62 CpGs showing consistent transition-specific effects between the two cohorts, at most of the CpGs in males, an increase in DNAm was associated with increased odds of asthma acquisition during the transition from pre- to post-adolescence, while for the next transition period, at most of the CpGs, increased DNAm was associated with decreased odds. In females, however, the associations at most of the CpGs were opposite to those in males: an increase in DNAm was associated with decreased odds of asthma acquisition from pre- to post-adolescence at most CpGs, but with increased odds at most of the CpGs in the transition period from post-adolescence to adulthood. Among the 50 CpGs (13 in males) showing main effects on asthma acquisition, although an increase in DNAm was associated with decreased odds of asthma acquisition at most of the CpGs for both males and females, the proportion of such CpGs was larger in females than in males. Furthermore, the effect sizes were overall weaker in females than in males. Before adolescence, asthma is more prevalent in males, but during adolescence more females acquire asthma and the prevalence of asthma in females surpasses that in males. The unique CpGs identified for each sex without any overlap and the inconsistent associations of DNAm with asthma acquisition between males and females seemed to be related to the gender-reversal phenomenon of asthma prevalence from pre- to post-adolescence.
Although we did not identify statistically significant biological processes after adjusting for multiple testing, biological processes involved in host immune function related to IL7 (i.e., interleukin-7-mediated signaling pathway, response to interleukin-7, cellular response to interleukin-7) were among the top processes determined based on statistical significance. The IRS1 gene was involved in the processes related to IL-7, and its mapped CpG (cg11620807) showed consistent association between the two cohorts. IL7 signaling has been suggested to promote immunopathogenesis of asthma 37,38 , indicating the potential informativeness of the identified CpGs for asthma acquisition. In addition, the TMEM194A gene identified based on DMR analyses in males has previously been shown to be associated with asthma in the GWAS catalog 39 . For females, the gene SERPINE2 in one of the identified DMRs has been connected with asthma based on genetic studies 40 .
The gene AKAP1, mapped to cg02467794, which showed consistent and statistically significant interaction effects in both cohorts in females, has been shown to be associated with asthma in the Agricultural Lung Health Study 41 . Although there was no overlap in identified CpGs between males and females, the gene ENO1 was among the mapped genes of the IOWBC-discovered CpGs in both sexes. The detection of IgG autoantibodies to alpha-enolase has been shown to be the most significant indicator for distinguishing severe asthma from mild-to-moderate asthma (OR = 5.2, 95% CI = 2.1-12.9, p-value < 0.001). It has been shown that alpha-enolase, an autoantigen, was associated with severe asthma 42 . The connection of the gene ENO1 with asthma acquisition shown in our study is consistent with its differentiation between severe and mild-to-moderate asthma. Further assessment of CpGs located on this gene is likely to benefit the potential of the CpGs as epigenetic markers for asthma acquisition.
The strength of this study lies in its focus on longitudinal assessment of asthma acquisition at two important transition periods, pre- to post-adolescence and post-adolescence to young adulthood, along with DNAm at two critical time points, pre- and post-adolescence. To our knowledge, this is the first study to examine the epigenetics of asthma acquisition from pre- to post-adolescence and from post-adolescence to young adulthood with respect to gender and transition period specificity.
Although more than 50% of the CpGs discovered in IOWBC showed consistent findings in ALSPAC, statistical significance was observed at only a small number of CpG sites. One reason for this lack of significance might be the age differences between the two cohorts. In addition, there is a potential concern of data double dipping. However, we do not see this as a significant concern, in that the statistical model applied in the screening process (linear regression without covariates such as atopic and smoking status) was different from the model in the final analyses (logistic regression with potential covariates). We also noticed that the number of asthma acquisitions at each age was relatively small. However, our use of repeated measures in the regression analyses had the potential to ease this concern, and the inclusion of the ALSPAC replication cohort to further examine the IOWBC findings, with a focus on consistent direction of associations, relieved it further to a certain extent. Another potential limitation is in the design of the data analyses, which focused on each individual CpG site. CpG sites might be correlated and work jointly to impact the risk of asthma acquisition, which deserves future investigation accompanied by carefully designed analytical plans. Finally, both cohorts, although independent, are mainly Caucasian. Thus, our findings are likely limited to this population. Nevertheless, the identified CpGs based on two independent cohorts have the potential to guide future studies in asthma acquisition prediction at different transition periods.
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request. For the ALSPAC data, please contact the ALSPAC executive committee (alspac-exec@bristol.ac.uk).

Conflicts of interest
The authors declare that they have no conflicts of interest.

[Figure caption fragment: main effects of DNAm on asthma acquisition at each of the 52 identified CpGs in IOWBC in females. Gene names corresponding to each CpG site are labeled on the X-axis.]