DNA methylation in blood - potential to provide new insights in immune cell biology

Epigenetics plays a fundamental role in cellular development and differentiation; epigenetic mechanisms, such as DNA methylation, are involved in gene regulation and the exquisite nuance of expression changes seen in the journey from pluripotency to final differentiation. Thus, DNA methylation has the potential to reveal new insights into immune cell biology. We mined publicly available DNA methylation data with a machine-learning approach to identify differentially methylated loci between different white blood cell types. We then interrogated the DNA methylation and mRNA expression of candidate loci in CD4+, CD8+, CD14+, CD19+ and CD56+ fractions from 12 additional, independent healthy individuals (6 male, 6 female). 'Classic' immune cell markers such as CD8 and CD19 showed the expected methylation/expression associations, fitting with the established dogma that hypermethylation is associated with the repression of gene expression. We also observed large differential methylation at loci which are not considered established immune cell markers, and some of these novel loci showed inverse correlations between methylation and mRNA expression (such as PARK2, DCP2). Our results highlight the value of mining publicly available data, the utility of DNA methylation as a discriminatory marker and the potential value of DNA methylation to provide additional insights into immune cell biology and developmental processes.


Background
Epigenetics refers to the heritable, but reversible, regulation of various genomic functions, including gene expression. It provides mechanisms whereby an organism can dynamically respond to changes in its environment and "reset" gene expression accordingly [1]. Furthermore, these mechanisms play a critical role in development and cell lineage specificity [2]; [3], as highlighted recently when epigenomic profiling revealed a linear differentiation model for memory T cells [4]. One such epigenetic mechanism is DNA methylation. Methylation of the cytosine nucleotide within CpG dinucleotides in DNA is well documented in humans [5]; [6]. DNA methylation can be developmentally 'hard-wired' (as in the case of imprinting [7]), underpin cell identity (i.e. cell markers of differentiation [8]; [6]) or be dynamic and change in response to environmental factors [9]. Therefore, the investigation of an individual's methylation pattern can reveal a lifetime record of environmental exposures as well as potential disease-specific marks [10]; [11].
It is well established that epigenetics contributes significantly to the developmental fate of cells and tissues [8]. For instance, epigenetic mechanisms contribute to the differentiation of hematopoietic stem cells from bone marrow [12]; [13]. Importantly, DNA methylation appears to play a crucial role at specific stages along the separation of blood cell lineages (myeloid, lymphoid) and contributes to the establishment and functionality of the final differentiated cell type [14]. Epigenetic marks, including DNA methylation, are increasingly recognised as potential discriminators of cell type [15]. This observation highlights the potential of DNA methylation analyses to uncover 'hidden' biology and, in the case of immunology, to identify previously unrecognised loci that could act as immune cell discriminators in addition to those cell surface markers currently used routinely. Furthermore, such analyses could identify previously unrecognised immune cell populations/sub-types.
DNA methylation as an epigenetic mark is easily quantified and evaluated from blood. Many recent studies using Illumina array technology have made their data publicly available, providing an excellent resource for hypothesis generation and testing in silico prior to wet-lab experimentation. We hypothesised that, because of its role in differentiation and development, DNA methylation could provide new insights into loci that discriminate between immune cell types; the role of these loci in cell discrimination might be previously unrecognised and/or could be harnessed to sort and/or identify potential new cell sub-types. Therefore, we initially performed an in silico experiment using data from a study which examined the DNA methylation profile of human white blood cell populations [16]. Reinius et al. investigated DNA methylation in: T cells (CD8+, CD4+); B cells (CD19+); natural killer cells (NK cells; CD56+); monocytes (CD14+); granulocytes (Gran; both CD16+ and Siglec-8+ cells); neutrophils (Neu, CD16+), and eosinophils (Eos, Siglec-8+). The authors identified DNA methylation marks in "classic" immune cell marker loci. Here, we detail an unsupervised analysis approach which, as anticipated, identifies discriminatory DNA methylation marks in "classic" immune cell markers, but also highlights significant differential methylation in "non-classic" immune markers, and genes for which a role in immune function is yet to be reported.

DNA methylation - discovery
We identified DNA methylation at 1173 CpG sites which clearly differentiated specific immune cell populations using publicly available data from whole blood [16]; hierarchical clustering and principal components analyses provide a visual presentation and highlight that these markers cluster the cell populations in a biologically meaningful way (Figure 1). Pathway analyses of the genes to which these 1173 CpG sites mapped strongly supported their discriminatory nature and, as expected, showed enrichment for immune cell biological function: CD56 (> 79 genes), CD4 (> 68 genes), CD8 (> 34 genes), CD14 (> 69 genes) and CD19 (> 194 genes). Furthermore, these results suggest that discriminatory CpG marker loci may map to genes with a hitherto unrecognised role in immune cell discrimination and/or function.

RNA expression
Given the role that DNA methylation plays in the regulation of gene expression, we also explored the mRNA levels of the 11 loci from our DNA methylation validation experiment. We investigated gene expression by qRT-PCR in the 12 independent samples. A clear differentiation between immune cell types at the gene expression level was observed for PARK2, POU2F2, DCP2, CD248, CD8A, SLC15A4, CD40LG and CD19 but not for FAR1, WIPI2 and KLRB1 (Figure 2).

Discussion
DNA methylation is exquisitely placed to reflect a cell's differentiation trajectory. Using publicly available data we identified 1173 unique CpG sites at which DNA methylation discriminated CD8+, CD4+, CD19+, CD56+, and CD14+ cell populations as well as granulocytes, neutrophils, and eosinophils. DNA methylation at two discriminatory CpG loci for each of CD8+, CD4+, CD19+, CD56+, and CD14+ was validated in 12 independent samples.
The majority of the 1173 discriminatory CpG sites mapped to annotated loci, and gene regulatory regions in particular. This suggests that, as expected, DNA methylation plays a key role in immune cell differentiation and cell-type identification. An important implication of this is that DNA methylation can be utilised to reveal previously unidentified immune cell sub-populations. A good example of this is the transcription factor FOXP3, which plays a key role in the development and function of Treg cells [26]; originally FOXP3 expression was used to identify Treg cells until it was deemed insufficient for the robust identification of suppressive Treg cells [27]; [28]. However, recent work has reported that hypomethylated CpG sites in four regions of FOXP3, CAMTA1 and FUT7 can be used to distinguish subsets of Tregs from non-regulatory CD4+ T cells [29]. These findings strongly support our view that DNA methylation, and thus loci identified in our study, could be used to inform similar experiments and reveal other drivers of specific immune cell subtypes.
Furthermore, large differences in DNA methylation were observed, and validated, at CpG loci in genes which, while their potential role in immune cell biology has been reported, have not previously been recognised as differentiators of immune cell type, such as WIPI2 [25] for CD19+, SLC15A4 [30]; [31]; [32] for CD56+ and PARK2 [33]; [34] for CD14+ cells. We also identified POU2F2/OCT2, for which a role as a B-cell differentiator was only recently reported [14]. In addition, significant cell-type-specific changes in DNA methylation were observed, and validated, in genes which, to the best of our knowledge, have no previously reported role in immune biology (FAR1, CARS2). Taken together, this highlights the significant potential of such analyses to uncover new facets of immunology. Many additional loci from our in silico analyses showed large differences in DNA methylation, and these warrant further investigation with respect to their roles in immune cell function.
Expression analysis of the genes to which the 11 validated DNA methylation discriminatory loci mapped also revealed discrimination at the mRNA level for CD248 and CD8A (CD8+), POU2F2 and CD19 (CD19+), PARK2 (CD14+), DCP2 (CD14+), SLC15A4 (CD56+), and CD40LG (CD4+). There were three genes (FAR1, WIPI2, KLRB1) for which this was not observed. One potential explanation is the presence of multiple isoforms per gene, such that the primer/probe combination for the qRT-PCR analysis did not target the correct isoform. This possibility warrants further investigation, especially given the increasing body of evidence that DNA methylation is an important modulator of alternative splicing [35]; [36]; [37].

Conclusions
In summary, this study highlights the value of mining publicly available data, the utility of DNA methylation as a discriminatory marker, the potential value of DNA methylation to provide additional insights into immune cell biology and developmental processes, and the tantalising possibility that DNA methylation can be harnessed to reveal currently unrecognised/undistinguishable immune cell sub-types.

Methods
Samples: Ethical approval was obtained from, and all experimental protocols were approved by, The Health and Disability Ethics Committee NZ (HDEC, 15/NTB/153). All methods were carried out in accordance with relevant guidelines and regulations. Written, informed consent was obtained from all participants, who were all over 18 years of age at the time of collection. Blood from 12 healthy individuals (n=6 male, n=6 female) was collected into sterile K2 EDTA vacutainers (BD Biosciences), and the buffy coat isolated. Peripheral blood mononuclear cells (PBMCs) were Fc receptor blocked and labelled with fluorescent antibodies specific for: CD3 (OKT3), CD4 (OKT4), CD8 (HIT8a), CD14 (HCD14), CD19 (HIB19) and CD56 (HCD56; all antibodies were from Biolegend), and dead cells were identified by DAPI exclusion. CD4+, CD8+, CD14+, CD19+ and CD56+ fractions were collected (Influx cell sorter, BD Biosciences) directly into ice-cold FACS buffer, immediately frozen on dry ice and stored at -80°C.
DNA and RNA extraction.
Both nucleic acids were extracted simultaneously using a Qiagen AllPrep DNA/RNA kit as per the manufacturer's protocol. High quality genomic DNA and RNA were obtained, with RNA RIN ≥ 7.5.

DNA methylation Analysis
Public Data. Publicly available methylation data were obtained from MARMAL-AID [17].
Penalised regression combining ridge and lasso penalties in an elastic-net framework, as implemented in the R package glmnet [18], was used to explore methylation association between each of the cell-types (CD8+, CD4+, CD19+, CD14+, CD56+, Neutrophils, Eosinophils, Granulocytes, as well as combinations of cell populations, PBMC and whole blood). The number of variables (~450,000 CpG sites) far outweighs the number of cell-types, and it is accepted that conventional statistical analysis procedures that test each CpG within an independent regression model suffer from a multiple testing burden and reduced statistical power. To overcome this issue we chose to use the penalised regression procedures of glmnet, which test all markers simultaneously, i.e. in a single regression model. glmnet was specifically designed to overcome issues of large variable number (k) and small sample size (n) and has been successfully applied to several genome-wide association studies of SNPs [19]; [20]; [21] and, recently, methylation [22]. We have previously developed this method, and reported on it in detail, to identify aging-associated DNA methylation loci [23]. The FactoMineR package [24] was used for PCA. All analyses were performed in R 3.5.2.
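As a rough illustration of the penalised-regression idea described above, the following sketch fits a small Gaussian elastic net by coordinate descent in plain Python. It is not the glmnet code used in the study, and Python is used here only as a stand-in for R. The function names, the simulated methylation values, the 0/1 coding of the target cell type and the `lam` and `alpha` settings are all assumptions made for illustration.

```python
import random


def soft_threshold(z, gamma):
    """Shrink z towards zero by gamma (the lasso part of the penalty)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net(X, y, lam=0.05, alpha=0.5, n_iter=100):
    """Fit a Gaussian elastic net by cyclic coordinate descent.

    Minimises (1 / (2n)) * sum_i (y_i - b0 - x_i . beta)^2
              + lam * (alpha * |beta|_1 + (1 - alpha) / 2 * |beta|_2^2),
    where alpha = 1 gives the lasso and alpha = 0 gives ridge regression.
    Returns the intercept for the centred predictors and the coefficients.
    """
    n, p = len(X), len(X[0])
    # Centre each column so the intercept is simply the mean of y.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    b0 = sum(y) / n
    beta = [0.0] * p
    resid = [yi - b0 for yi in y]
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            denom = col_sq[j] + lam * (1 - alpha)
            if denom == 0.0:
                continue  # constant column under a pure lasso penalty
            # Covariance of column j with the residual that excludes column j.
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n
            rho += col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, lam * alpha) / denom
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new_beta
    return b0, beta


# Simulated data (hypothetical): 12 samples, 30 CpGs, so variables > samples.
rng = random.Random(0)
n_cpgs = 30
y = [1.0] * 6 + [0.0] * 6  # 1 = target cell type, 0 = any other cell type
X = []
for label in y:
    row = [rng.uniform(0.45, 0.55) for _ in range(n_cpgs)]  # uninformative CpGs
    row[0] = 0.1 if label == 1.0 else 0.9  # CpG 0 is hypomethylated in the target
    X.append(row)

b0, beta = elastic_net(X, y, lam=0.05, alpha=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(selected, round(beta[0], 3))
```

With these simulated inputs only the first CpG keeps a nonzero (negative) coefficient, reflecting its lower methylation in the target population, while the lasso term shrinks the remaining coefficients exactly to zero. All CpGs enter a single model, which is the property the text relies on. A cell-type classification would more naturally use a binomial or multinomial response, which this sketch does not attempt.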
Pyrosequence analysis: DNA methylation pyrosequencing was designed and performed by EpigenDX (USA), who were provided with the Illumina probe information.
Gene Expression Analysis: 150 ng total RNA was reverse transcribed using VILO Superscript (Thermo Fisher

Figure: Methylation and gene expression heatmaps for all 11 genes investigated. Expression and methylation measures were split into quartiles and their levels coloured accordingly.
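The quartile split mentioned in the heatmap caption can be sketched as follows. This is an illustration only: the function name `quartile_bins` and the methylation values are hypothetical, and the cut points use the default 'exclusive' method of Python's `statistics.quantiles`, which may differ from the quantile definition used for the published figure.

```python
from bisect import bisect_right
from statistics import quantiles


def quartile_bins(values):
    """Return the quartile index (1 to 4) of each value within `values`."""
    cuts = quantiles(values, n=4)  # three cut points: Q1, median, Q3
    # bisect_right counts how many cut points lie at or below the value.
    return [bisect_right(cuts, v) + 1 for v in values]


# Hypothetical methylation percentages for one CpG across eight samples.
methylation = [5.0, 12.0, 20.0, 35.0, 50.0, 65.0, 80.0, 95.0]
bins = quartile_bins(methylation)
print(bins)
```

Each measure would be binned separately (methylation and expression each within their own distribution), and the resulting index 1 to 4 mapped to a colour level in the heatmap.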