Analysis of Systemic Epigenetic Alterations in Inflammatory Bowel Disease: Defining Geographical, Genetic and Immune-Inflammatory influences on the Circulating Methylome

Abstract
Background: Epigenetic alterations may provide valuable insights into gene–environment interactions in the pathogenesis of inflammatory bowel disease [IBD].
Methods: Genome-wide methylation was measured from peripheral blood using the Illumina 450k platform in a case-control study in an inception cohort (295 controls, 154 Crohn’s disease [CD], 161 ulcerative colitis [UC], 28 IBD unclassified [IBD-U]) with covariates of age, sex and cell counts, deconvoluted by the Houseman method. Genotyping was performed using Illumina HumanOmniExpressExome-8 BeadChips and gene expression using the Ion AmpliSeq Human Gene Expression Core Panel. Treatment escalation was characterized by the need for biological agents or surgery after initial disease remission.
Results: A total of 137 differentially methylated positions [DMPs] were identified in IBD, including VMP1/MIR21 [p = 9.11 × 10⁻¹⁵] and RPS6KA2 [p = 6.43 × 10⁻¹³], with consistency seen across Scandinavia and the UK. Dysregulated loci demonstrate strong genetic influence, notably VMP1 [p = 1.53 × 10⁻¹⁵]. Age acceleration is seen in IBD [coefficient 0.94, p < 2.2 × 10⁻¹⁶]. Several immuno-active genes demonstrated highly significant correlations between methylation and gene expression in IBD, in particular OSM [IBD r = −0.32, p = 3.64 × 10⁻⁷ vs non-IBD r = −0.14, p = 0.77]. Multi-omic integration of the methylome, genome and transcriptome also implicated specific pathways that associate with immune activation, response and regulation at disease inception. At follow-up, a signature of three DMPs [TAP1, TESPA1, RPTOR] was associated with treatment escalation to biological agents or surgery (hazard ratio of 5.19 [CI: 2.14–12.56], logrank p = 9.70 × 10⁻⁴).
Conclusion: These data demonstrate consistent epigenetic alterations at diagnosis in European patients with IBD, providing insights into the pathogenetic importance and translational potential of epigenetic mapping in complex disease.


Introduction
Inflammatory Bowel Diseases (IBD), phenotypically classified into two main entities, Crohn's disease (CD) and Ulcerative Colitis (UC), represent an important public health concern, with a projected prevalence of 1% in Western populations by 2030. There are significant implications for healthcare planning as costs, particularly of new treatments, increase [1]. These considerations have accelerated global efforts to better understand the aetiology of IBD. Unequivocal data now implicate the interaction of host susceptibility with the exposome in the development of IBD [2]. These data have inevitably stimulated studies to explore the potential importance of epigenetic mechanisms, including DNA methylation, in pathogenesis. DNA methylation may regulate gene expression through its effect on the chromatin state as well as the accessibility of transcription factor binding sites [3][4][5], and can be influenced by many pertinent environmental factors including smoking and age [6,7].
We have previously identified several alterations in the IBD-associated circulating epigenome. Our initial study in the high-prevalence Scottish population characterised a replicable pattern of DNA alterations in children with CD, with highly significant enrichment of methylation changes around GWAS single nucleotide polymorphisms, in particular the HLA region and VMP1/MIR21 [8]. More recently, our adult epigenome-wide study within the Scottish population identified distinct differential methylation across key IBD genes [9]. Certain signals were highly cell-specific; RPS6KA2 was shown to be a CD14+ monocyte-specific signal in IBD [9]. These top differential methylation signals were independently confirmed and replicated in a treatment-naïve paediatric CD cohort in North America (RISK consortium) [10], providing strong stimulus for further research in this field. Importantly, the extent to which the findings can be generalised to other populations is largely unknown. Furthermore, the timing, stability, and functional importance of epigenetic alterations in affecting gene transcription have not been fully investigated.
With the advancing therapeutic repertoire in IBD, there is keen interest in risk-stratifying patients at diagnosis in order to allow personalised medicine, with recent optimism that genetic, transcriptional, glycomic or serological markers may predict disease course and help position new therapies [11][12][13][14][15][16][17][18]. Promising data have emerged that epigenetic alterations are helpful in other diseases, notably colorectal cancer. The potential clinical utility of these epigenetic marks as biomarkers in immune-mediated diseases is yet to be determined.
In our multi-centre study, we aim to extend current understanding of the pathogenetic and translational importance of the circulating epigenome in IBD. We firstly aim to assess the consistency of DNA methylation alterations in IBD across geographically distinct populations in the UK, Scandinavia and Spain. We investigate environmental modifiers of the methylome including inflammation, smoking and aging. Furthermore, we define the genetic contribution to these alterations and their association with gene expression, and define the methylome of disease progression in IBD. Whilst focussed on IBD, the findings have potential implications beyond this disease field, for other complex and immune-mediated diseases.

Study demographics
The IBD Character cohort represents a multi-centre inception cohort in which 247 (72%) of the 343 IBD patients were treatment naïve at recruitment. Genome-wide methylation was measured in 638 DNA samples extracted from peripheral blood (295 controls, 154 CD, 161 UC and 28 IBD-unclassified (IBD-U) patients). Table 1 summarises study recruitment and patient demographics. The mean age in patients with IBD (n=343) was 34 years (range 7-79), and 33 years (range 3-79) in controls (n=295). A total of 27% of CD patients had a colonic disease phenotype at recruitment, while 42% of patients with UC exhibited extensive colitis. Two hundred individuals were recruited in the UK, 367 in Scandinavia and 66 in Spain at presentation for investigation of suspected IBD. A total of 33% of those recruited in the UK had confirmed IBD after investigation, while 58% and 39% had IBD in the Scandinavian and Spanish cohorts respectively (Table 1).

Differentially methylated probes in Inflammatory Bowel Disease
Across the entire cohort, 137 probes exhibited Holm-significant IBD-associated methylation differences in comparing IBD with controls (Supplementary Table 1). These include probes mapping to the loci VMP1/MIR21 (p=9.11×10⁻¹⁵), SBNO2 (2.70×10⁻¹⁴), RPS6KA2 (6.43×10⁻¹³), and TNFSF10 (7.72×10⁻⁸), thereby replicating and validating our previous findings [9]. Novel findings include differential methylation of PHOSPHO1 (3.43×10⁻⁹) and SELPLG (2.54×10⁻⁷). Table 2 summarises the top differentially methylated positions (DMPs). Similar analyses were performed to identify UC- and CD-specific DMPs. There were 72 DMPs that differentiated CD from controls (Supplementary Table 2) and 67 DMPs that differentiated UC from controls (Supplementary Table 3). There were 24 DMPs that demonstrated overlap across UC and CD analyses (Supplementary Figure 1). There were no probes that differentiated UC from CD. Analysis was performed of Differentially Methylated Regions (DMRs), defined as regions with ≥ 3 contiguous probes with FDR-corrected p<0.05 within a distance threshold based on expected probe density. The VMP1 locus on chromosome 17 was the only DMR identified within our dataset that remained significant using these criteria.
Next, we compared our dataset with the previous findings from our group on DMPs in adult and paediatric IBD by correlation analyses of the top 1000 methylation probe beta values in the 3 cohorts (Supplementary Figure 2) [8,9]. Strong correlation was seen across populations (adult BIOM cohort [9]: r=0.96, p<2.20×10⁻³⁰⁸; paediatric cohort: r=0.81, p<2.20×10⁻³⁰⁸).

Consistency of Differentially methylated positions (DMPs) across Northern Europe
We then analysed the consistency of methylation across Europe (Figure 1) by splitting our cohort based on geographic area (Scandinavia vs UK vs Spain). Independent DMP analysis using all probes was performed in Scandinavia, identifying 34 probes that differentiated IBD from controls with a Holm p<0.05 (Table 3). These included SBNO2 (p=2.88×10⁻⁹), VMP1 (p=6.89×10⁻⁸) and RPS6KA2 (p=1.57×10⁻³). A total of 26 of these probes were significant in the UK cohort (n=200). In the Spanish cohort (n=66) only one (RPS6KA2) remained significant (Holm p=0.03). Power calculations were performed in view of these findings, taking into consideration the effect sizes noted in the Scandinavian cohort. This post-hoc analysis confirmed that the Spanish cohort was adequately powered to detect significant differences in 11 of the 34 DMPs identified in the Scandinavian dataset (power=0.8, alpha=0.05, effect size cut-off=0.8; Supplementary

The association of DNA methylation with inflammation
In order to better understand the influence of inflammation on the top differentially methylated probes, three distinct analyses were performed. These included correlations with inflammatory markers, DMP analysis using inflammatory markers such as hsCRP as covariates, and DMP overlap with published inflammation-associated methylation probes. We investigated the correlation of the 137 IBD-associated DMPs with inflammatory markers, i.e. high-sensitivity C-reactive protein (hsCRP) and albumin, in individuals with complete data (n=591; Supplementary Figure 4 Table 6). In order to adjust for inflammatory activity, we also performed DMP analysis with the previously described covariates and the addition of hsCRP. A total of 59 probes remained significant and included SBNO2 (p=5.26×10⁻¹⁴), RPS6KA2 (p=4.51×10⁻¹¹) and VMP1 (p=5.36×10⁻¹⁰) (Supplementary Table 7).
We then compared our top DMPs to those that have been reported to be associated with CRP levels [21]. A total of 43 of the 218 DMPs identified by Ligthart et al. were differentially methylated in our study. There were, however, 94 probes that still associated with IBD and were independent of inflammation-associated probes (Supplementary Table 8). These included CLU (cg16292768), SBNO2 (cg12170787) and VMP1 (cg02782634).
Finally, in order to identify inflammation-independent DMPs, we excluded probes that correlated with hsCRP or were among the reported inflammation-associated DMPs, and identified 30 DMPs that showed no overlap with published inflammation-associated DMPs or any correlation with inflammation (Supplementary Table 9).

p<2.20×10⁻¹⁶). Differences were also seen between IBD subtypes compared to non-IBD (vs. UC: 5.08 years

Germline variations show a strong correlation with DNA methylation (meQTLs)
Using paired genetic and methylation data for the entire cohort (n=638), with age and sex as covariates, meQTLs were generated for the top DMPs (n=137). A total of 2991 cis-meQTLs were identified. After applying a MAF >0.05 and Holm adjustment, 341 cis-meQTLs remained significant across 21 unique genes, indicating a strong genetic influence on methylation. Several key loci that were significantly differentially methylated in IBD had a strong genetic influence, including ITGB2 (7 cis-meQTLs; top p=2.83×10⁻¹⁶) and VMP1/MIR21 (143 cis-meQTLs across 6 probes) (Supplementary Table 12). This includes meQTLs with a known GWAS single nucleotide polymorphism (SNP), rs1292053, and also with its previously reported LD SNP (rs8078424, r²=0.43, top p=1.48×10⁻²⁰) [9]. Other novel IBD-relevant associations include AIM2 (16 cis-meQTLs; top p=2.83×10⁻¹⁶).
In order to assess whether DNA methylation plays a causal role in IBD, Mendelian randomisation was applied to our dataset using TwoSampleMR [24]. The most significant meQTL for each CpG (sentinel meQTL) was generated using all SNPs and methylation probes independent of a diagnosis of IBD (Supplementary Table 13). Using sentinel meQTLs as the instrumental variable, methylation as the exposure variable and IBD as the outcome variable, no causal associations were identified in our dataset.

Integrative analysis identifies immune cell-related activation in IBD
Multi-omics analysis was performed using Multi-omics Factor Analysis (MOFA) [25]. Integration of IBD-related SNPs and mRNAs around the 137 DMPs produced ten factors, of which the first four explained most of the variability in the dataset (Fig. 3A). Factor 1 and Factor 4 strongly correlated with IBD and hsCRP (Fig. 3B&E). These two factors were mostly influenced by DMPs and mRNAs (Fig. 3C) and remained independent of each other (r=0.052, Fig. 3D). Factor 1 was higher in IBD compared with controls regardless of inflammation (Fig. 3F; p<2.20×10⁻¹⁶) and appeared specific for IBD. In contrast, at null hsCRP, Factor 4 was similar in IBD patients and controls (p=0.374 when < 0.5 mg/L) but was found to be elevated in IBD overall (Fig. 3D; p=4.21×10⁻¹³) due to higher values relative to controls starting at even marginally increased hsCRP. In UC, Factor 1 reflected the extent of colitis (Fig. 3G). Thus, multi-omics integration revealed DMP-driven, IBD- and hsCRP-associated Factors 1 and 4.
Factor 1 was defined by greater methylation of genes such as CXCR6 and CD247, and reduced methylation of ZEB2 (Fig. 3H-I). The transcriptomic repertoire of Factor 1 prominently featured S100 proteins and matrix metalloproteinases. Only one IBD GWAS SNP contributed to Factor 1: rs7495132 (CRTC3). In gene ontology, Factor 1 was related to inflammation with activation of undifferentiated leukocytes, and lipid metabolism (Fig. 3J).
The composition of Factor 4 shows that IBD-related differential methylation at VMP1 may relate to opsonization and phagocytosis, and testifies to close coupling of pro- and anti-inflammatory responses.

DNA methylation associates with disease course in Inflammatory Bowel Disease
Follow-up data were available for 291 patients with IBD in order to identify methylation markers that predicted treatment escalation in IBD over a median follow-up period of 526 days (IQR: 223-775) (

Discussion
Are geographical replicability and variability preferentially driven by germline variation or alterations in the exposome?
In complex diseases such as IBD, DNA methylation potentially represents a mechanism at the interface between genetics and the environment. The strengths of this prospective case-control study include recruitment of individuals with a new diagnosis of IBD, mostly naïve to medical therapy, across multiple clinical centres in Europe. Importantly, our data strongly replicate, validate and extend previous key DMP findings from children with CD and adults in the index Scottish population, where our initial data were generated [8,9], to Scandinavia. We demonstrate significant evidence for dysregulation of several previously implicated loci, notably VMP1, RPS6KA2 and SBNO2, across Scandinavia and the UK, populations with a shared genetic ancestry. Replication was less evident in the smaller Southern European cohort recruited in Spain. Despite the cohort being adequately powered to detect significant differences in 11 of the 34 probes tested, significant differential methylation for top probes such as ZEB2 and SBNO2 was not seen in the Spanish cohort. RPS6KA2 remains the only consistent signal across the UK, Scandinavia and Spain. These differences may be in line with the North-South gradient in IBD that has been reported across the USA and Europe [26-28]. Whilst more detailed studies in Southern Europe are needed, these data may suggest population specificity of several methylation changes in IBD and heighten the focus on exploring genetic influences and factors within the exposome in these populations.
DMPs implicated in our study such as VMP1, RPS6KA2 and SBNO2 were also highly significant in the paediatric studies at diagnosis (RISK consortium; USA and Canada) [10]. It remains to be seen whether the methylation differences seen within our predominantly Caucasian non-Jewish cohort are replicated in other non-Caucasian populations. Population-specific methylation differences at 439 CpG sites were reported in a study that included Caucasian-American, African-American and Han Chinese-American healthy individuals and associated these changes with distinct phenotypic characteristics such as drug metabolism, disease susceptibility and appearance [29]. Although the majority of these differences were due to underlying genetic variations, up to one third of the DMPs were independent of germline variations, bringing into focus the role of non-genetic influences such as the exposome on the epivariance across populations. Similar findings were also reported in a Hispanic-origin cohort, where genetic ancestry explained up to three quarters of the variance in methylation [30], and an Indonesian cohort, where up to 10% of genes showed differential expression and methylation patterns between islands, likely attributable to small-scale environmental differences [31]. It remains to be determined whether the differences which are apparent between Northern Europe and Southern Europe within our study are genetically determined or whether these are related to the exposome. Further multiregional epigenome studies are now needed to explore this further in IBD.

The influence of genetics on DNA methylation in IBD
The differences between populations focus interest on the relative importance of germline variation on methylation, and in turn on gene expression. Our data also shed new light on the relationship between germline variation at specific loci and epigenetic alterations in complex disease. A key locus, VMP1, showed 143 cis-meQTLs across 6 probes and includes meQTLs with a known GWAS SNP, rs1292053, and its LD SNP (rs8078424, top p=1.48×10⁻²⁰) [9]. However, causal inference using Mendelian randomisation could not be determined. A recent study in CD demonstrated that DNA methylation at 3 CpG sites causally associated with a GWAS SNP, rs1819333, using Mendelian randomisation analysis, likely through transcriptional regulation of RPS6KA2 expression [10,32]. This SNP was not on our genotyping platform and further validation could not be performed in this study. RPS6KA2-specific hypomethylation has consistently been demonstrated in IBD across several independent cohorts and across all ages [8-10]. This gene encodes a ribosomal kinase, a member of the serine/threonine kinase family, which regulates the autophagy-associated mTOR pathway and is particularly relevant in CD [33]. These data provide insight into the complex interaction of genetics and epigenetics in the pathophysiology of IBD.

Pro-inflammatory pathways
Of particular interest within the field of DNA methylation has been to address whether the changes seen are causal or a consequence of disease initiation. Somineni et al. explored this in the paediatric cohort through several methods including longitudinal dynamics of methylation, correlation with known inflammatory markers and Mendelian randomisation [10]. Although the vast majority of methylation signals in the Somineni et al. study represented a consequence of disease initiation through inflammation, there were 10 CpGs that were unchanged at follow-up [34]. In our study, some of the methylation alterations have been shown to associate with CRP levels [21]. Even though these changes may be a consequence of inflammation, the genes implicated are of great interest as potential targets for future drug discovery, and the mechanisms by which these genes associate with inflammation, disease onset and disease course need further exploration. An example is OSM, a pro-inflammatory gene that has recently been shown to be upregulated in inflamed intestinal tissue in IBD. This gene is able to predict anti-TNF response in IBD, making it a useful clinical biomarker and a potential drug target [35]. In our study, a 450K probe close to the TSS of OSM (TSS1500, p=8.6×10⁻¹¹) is differentially methylated in IBD compared to controls and demonstrates correlation with OSM gene expression in IBD compared to controls (IBD r=−0.32, p=3.64×10⁻⁷ vs. non-IBD r=−0.14, p=0.77; TSS1500). Other signals linked with CRP-associated DMPs have relevance in IBD pathogenesis. One of the top signals is P-selectin glycoprotein ligand-1 (PSGL-1 or SELPLG), which plays a critical role in immune cell recruitment to sites of tissue inflammation [36]. Anti-PSGL-1 antibody is currently in a phase 2 drug trial for the treatment of CD (NIH project no. 1R44DK085845-01A1). Furthermore, differential methylation within this gene correlated with PSGL-1 expression in our study (IBD r=−0.53, p=2.24×10⁻²² vs. non-IBD r=−0.37, p=6.34×10⁻⁸; 5'UTR). It is known that a normal level of DNA methylation is required to control differential expression of maternal and paternal alleles of imprinted genes [37] and plays an important role in cell differentiation and embryonic development. Studies have shown that methylation at gene promoter regions can vary depending on cell type, with hypermethylation corresponding with low or no gene transcription [38,39].
Multi-omic data integration using MOFA revealed two factors (Factors 1 and 4) strongly associated with IBD and pro-inflammatory pathways. Factor 1 was defined by greater methylation within the CXCR6 and CD247 genes, reduced methylation of ZEB2 (Fig. 3H-I) and mRNA expression predominantly featuring S100 proteins and matrix metalloproteinases, whereas Factor 4 was defined by reduced methylation at VMP1 (TMEM49) and the interferon-inducible inflammasome trigger AIM2. The CpGs and mRNA repertoire that define these two factors highlight the role that immune cell activation, cellular response and regulation play in shaping the circulatory methylome and transcriptome in patients with IBD at disease inception. These data provide a repertoire of novel targets to develop future drug therapies that target these pro-inflammatory pathways at disease onset.

Age acceleration in Inflammatory Bowel Disease
There is emerging interest in studying disease-associated age acceleration, defined as the difference between predicted age (determined by DNA methylation patterns) and chronological age. Recent studies have linked epigenetic age acceleration to all-cause mortality [40,41] and cardiovascular disease outcomes [42]. Epigenetic age acceleration is yet to be explored in the context of immune-mediated diseases such as IBD. In our cohort, significant DiffAge was seen between IBD and controls (non-IBD: (Table 5). In IBD, TAP1 differential methylation has been previously reported as a CD-specific signal in IECs from small intestine and colon and reported to be a regulatory DMR (rDMR: a differentially methylated region (DMR) located within 10 kb of the transcription start site of a differentially expressed gene) [20]. The RPTOR gene forms part of the mTORC1 complex, which has been shown to promote UC through COX-2 mediated Th17 responses [45]. Depletion of RPTOR inactivates mTORC1 and ameliorates UC [45]. These differentially methylated genes also differ from those implicated in disease susceptibility. Similar trends have been seen in GWAS in CD, where unique prognostic genes have been identified that do not overlap with disease susceptibility [14].
Several studies including GWAS have utilised the CD8+ T-cell derived transcriptome criteria for escalation, defined as the need for 2 or more immunosuppressants after initial disease remission [13,14][46][47][48]. These studies collectively have provided the first lines of evidence for a circulating biomarker profile to define a subset of patients that require treatment escalation and who may benefit from a 'top-down' approach to management at an early stage. Our data complement these findings and provide further molecular depth by defining their methylome signature.
Within the paediatric CD group, DNA methylation in separated circulating CD8+ T cells has been shown not to predict outcomes in CD [49]. Given that CD8+ cells only represent a small proportion of the cellular compartment in whole blood, our signals warrant further exploration of the prognostic methylome in other immune cells. It is noteworthy in this context that the e-FORGE analysis in our dataset preferentially implicates monocytes. Furthermore, our prognostic analysis includes UC; the molecular prognostic profile of this subtype has not been examined previously. Further large multicentre studies are now needed to explore these findings further both in adults and children.

Strengths and limitations
Our study has recruited, over four years, one of the largest multi-centre European inception cohorts reported to date, and generated IBD-specific methylation signals associated with disease onset and progression. We have explored geographical, genetic and non-genetic correlates, as well as mechanistic and translational implications. We have previously demonstrated that the most dysregulated areas implicated are cell-specific [9]. There are certain methodological considerations to take into account when interpreting our data. Firstly, our dataset is adequately powered in Scandinavia and the UK, but we acknowledge that the sample size will detect some, but not all, DMPs in Southern Europe (Spain). We have demonstrated, in the Spanish cohort, replication for RPS6KA2, one of the most robust findings in Northern Europe and North America, whilst excluding a significant effect for ten other markers, all highly significant in Scandinavia.
One of the challenges of methylation studies in inflammatory diseases is to disentangle the effect of inflammation on the top DMPs. We have approached this in appropriate detail by several methods, including correlation analyses with inflammatory markers such as hsCRP and matching our top probes with publicly available inflammation-associated DMPs.
There are also potential challenges in the prognostic analysis of data from multi-centre studies associated with the concern that decision-making can vary across centres. However, in our study all European centres used similar criteria and guidelines for decisions on escalation in IBD (also known as 'step-up' approach) where increments are made in therapy based on response to the initial treatment.
Although we were able to identify and internally validate IBD-specific prognostic signals in this analysis, IBD subtype analysis and validation was not feasible due to the small number of patients who escalated therapy within each subtype. It is worth noting, however, that certain prognostic signals such as TAP1 have been shown to be CD-specific in the published literature. Future studies are needed to further validate our findings.

Conclusions
This study highlights the stability of the IBD-specific circulating methylome across regions with shared ancestry. We demonstrate a close association of the methylome with inflammation, and through integrative multi-omic analyses we identify key pro-inflammatory genes that are upregulated in IBD at inception. Furthermore, differential methylation within certain genes such as TAP1 associates with disease course over time. These data provide a rich resource for future translational studies investigating the epigenome in IBD, and potentially represent a paradigm for analysis in other complex diseases.

Study design
Patients were recruited prospectively as part of the IBD-CHARACTER inception cohort (reference 305676) from gastroenterology appointments across 7 centres in Europe (Table 1). All IBD cases met the standard diagnostic criteria for either UC or CD following thorough clinical, microbiological, endoscopic, histological, and radiological evaluation. The Lennard-Jones, Montreal and Paris criteria were used for diagnosis and classification of clinical phenotypes. The control group consisted of symptomatic controls attending gastroenterology clinic during the same period with no evidence of IBD after further investigations and at follow-up. Healthy volunteers were also recruited into the study.
We collected patient demographics including sex, age at diagnosis, and date of diagnosis. Details of drug therapy and concomitant medications were recorded. Treatment naïvety within the IBD cohort was defined as no exposure to any IBD-related medical therapies such as oral and topical steroids, oral and topical 5-ASA therapies, biological therapies and immunomodulators. High-sensitivity CRP (hsCRP) and albumin were re-assayed in a single batch at the end of recruitment. Other routine markers including haemoglobin and white cell count were also recorded.
Patients with IBD were followed prospectively and information on clinical outcomes was collected during follow-up. Treatment escalation was defined as the need for a biologic, ciclosporin or surgery, instituted for disease flare after initial induction therapy and aiming to induce disease remission. In UC, the definition of treatment escalation also included colectomy during index admission.
All patients and controls provided written, informed consent with local ethical approval at each centre.

Genome-wide methylation profiling
Peripheral blood leukocyte DNA was bisulphite converted and analysed using the Illumina HumanMethylation450 platform (Illumina, San Diego, CA, USA) [50]. Cases and controls were randomly distributed across chips. Data were processed using the meffil package [51] in R (R Foundation for Statistical Computing, Vienna). Samples containing >1% probes with detection p values >0.01 were removed. Probes with bead counts <3 in 10% of samples, or detection p values >0.01 in 10% of samples, were also removed. Sex mismatches were identified by analysing the median intensities of the sex chromosome probes and removed from further analyses. Genotypes were compared with genotyping probes on the methylation array. Probes containing SNPs with a minor allele frequency of ≥0.01 in the European population in the 1000 Genomes Project were also removed.
Three samples with low signal intensities were removed (>1% probes with detection p values >0.01) and 684 probes were filtered out due to low signal. There were n=15 that failed quality control and were removed from further analyses.
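The detection p-value filters described above can be illustrated with a short sketch. The study itself used the meffil package in R; the Python below is not that implementation, only a minimal stand-alone illustration of the two thresholds. The function names, the data layout (a dictionary of per-sample p-value lists) and the choice to evaluate probes on the retained samples only are assumptions made for the example.

```python
def failing_fraction(p_values, cutoff=0.01):
    """Fraction of detection p-values that exceed the cutoff."""
    return sum(p > cutoff for p in p_values) / len(p_values)


def qc_filter(detection_p, sample_max=0.01, probe_max=0.10, p_cutoff=0.01):
    """Apply sample-level, then probe-level, detection p-value filters.

    detection_p: dict mapping sample id -> list of detection p-values,
                 with probes in the same order for every sample
                 (hypothetical layout chosen for this sketch).
    Returns (kept_samples, kept_probe_indices).
    """
    # Drop samples in which more than `sample_max` of probes fail detection.
    kept_samples = [s for s, ps in detection_p.items()
                    if failing_fraction(ps, p_cutoff) <= sample_max]
    if not kept_samples:
        return [], []

    # Drop probes failing in at least `probe_max` of the retained samples
    # (assumption: the probe filter is computed after the sample filter).
    n_probes = len(detection_p[kept_samples[0]])
    kept_probes = []
    for j in range(n_probes):
        column = [detection_p[s][j] for s in kept_samples]
        if failing_fraction(column, p_cutoff) < probe_max:
            kept_probes.append(j)
    return kept_samples, kept_probes
```

With only a handful of toy probes the 1% sample threshold removes any sample with a single failure, so small examples are easier to follow with a looser `sample_max`.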

Differentially methylated positions and regions
Cell proportions were estimated from methylation data using the Houseman algorithm [52] in the meffil package in R. Differentially methylated position analysis (DMP, single CpG probe) was performed using age, sex and cell proportions as covariates [53]. Statistical significance was set at p<0.05 following adjustment for multiple testing using Holm correction [54] for whole blood data.
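For readers unfamiliar with the Holm step-down adjustment used as the significance threshold here, a minimal sketch follows. It implements the standard Holm procedure in plain Python; the function name is an assumption for illustration, and in practice an established routine (for example R's `p.adjust` with `method = "holm"`) would normally be used.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order.

    The i-th smallest raw p-value (i = 0, 1, ...) is multiplied by (m - i),
    capped at 1, and a running maximum keeps the adjusted values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

A probe would then be called significant when its adjusted value falls below 0.05, which is equivalent to the sequential rejection rule of the Holm method.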
Differentially methylated regions (DMRs) were identified using the Lasso function from the ChAMP pipeline [55,56] (distinct from the lasso function described below in the biomarker discovery section) and defined as three or more contiguous probes within a distance threshold based on expected probe density, each achieving an FDR-corrected p<0.05 in DMP analysis.
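The DMR definition above (three or more contiguous significant probes within a distance threshold) can be sketched as a simple run-finding pass. This is a deliberate simplification and not the ChAMP Lasso algorithm: the fixed `max_gap` of 1000 bp stands in for the density-based distance threshold, and the function name and tuple layout are assumptions for illustration.

```python
def find_dmrs(probes, max_gap=1000, min_probes=3, alpha=0.05):
    """Group contiguous significant probes into candidate regions.

    probes: list of (chrom, position, fdr_p) tuples.
    Returns a list of (chrom, start, end, n_probes) tuples.
    """
    regions, run = [], []
    # A non-significant sentinel at the end flushes the final run.
    for chrom, pos, fdr in sorted(probes) + [(None, 0, 1.0)]:
        significant = fdr < alpha
        extends = (bool(run) and run[-1][0] == chrom
                   and pos - run[-1][1] <= max_gap)
        if significant and extends:
            run.append((chrom, pos))
            continue
        # The current run ends here; keep it if it is long enough.
        if len(run) >= min_probes:
            regions.append((run[0][0], run[0][1], run[-1][1], len(run)))
        run = [(chrom, pos)] if significant else []
    return regions
```

A non-significant probe lying between two significant ones breaks the run, which is how "contiguous" is interpreted in this sketch.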

Epigenetic clock and Age acceleration
Age acceleration (AgeAccel) is defined as the residual resulting from a linear regression model, regressing the Horvath estimate of epigenetic age (biological age) on chronological age [7]. A positive value for AgeAccel indicates that the observed epigenetic age is higher than that predicted, based on chronological age. DiffAge according to Horvath was defined as the difference between predicted biological age and chronological age [7]. A positive value for DiffAge is seen for individuals with an increased biological age compared to their chronological age [23].
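The two measures defined above differ only in what is subtracted from the epigenetic age, which a small worked sketch makes concrete. The code assumes the epigenetic (Horvath) age has already been estimated elsewhere; the function names are illustrative, and the ordinary least-squares fit is written out by hand so that the example needs no external libraries.

```python
def diff_age(epigenetic_age, chronological_age):
    """DiffAge: epigenetic age minus chronological age, per individual."""
    return [e - c for e, c in zip(epigenetic_age, chronological_age)]


def age_accel(epigenetic_age, chronological_age):
    """AgeAccel: residuals from regressing epigenetic age on chronological age."""
    n = len(chronological_age)
    mean_x = sum(chronological_age) / n
    mean_y = sum(epigenetic_age) / n
    sxx = sum((x - mean_x) ** 2 for x in chronological_age)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(chronological_age, epigenetic_age))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # A positive residual means the individual is epigenetically older
    # than the cohort-wide trend predicts for their chronological age.
    return [y - (intercept + slope * x)
            for x, y in zip(chronological_age, epigenetic_age)]
```

Because AgeAccel is a regression residual it averages to zero across the cohort it was fitted on, whereas DiffAge retains any systematic offset of the clock.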

Methylation Quantitative Trait Loci Analyses (MeQTL) and Mendelian Randomisation
Whole blood leukocyte DNA was extracted using the Nucleon BACC 3 DNA extraction kit (GE Healthcare, Buckinghamshire, UK). Patients were genotyped using the Illumina HumanOmniExpressExome-8 Bead Chips (Illumina, San Diego, CA, USA). A sex-check was performed using PLINK to identify and remove sex mismatches. MeQTLs and eQTLs were estimated using the Matrix-eQTL package [57], with a distance threshold of 1Mb. SNPs with a minor allele frequency of <5% were filtered from downstream analyses. Age and sex were included as covariates in order to identify specific disease-associated meQTLs. A Holm-corrected p<0.05 was used as the threshold for statistical significance for disease-associated meQTLs.
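The pre-filtering implied by this paragraph (a minor allele frequency cut-off and a 1Mb cis window) can be sketched as follows. This is not the Matrix-eQTL implementation, which also fits the association model; it only illustrates which SNP-CpG pairs would be tested. The function names, identifiers and data layout are hypothetical, and genotypes are assumed to be coded as 0/1/2 allele counts with `None` for missing calls.

```python
def minor_allele_frequency(dosages):
    """MAF from 0/1/2 allele counts; missing calls (None) are ignored."""
    called = [d for d in dosages if d is not None]
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def cis_pairs(snps, cpgs, window=1_000_000, min_maf=0.05):
    """List SNP-CpG pairs eligible for cis-meQTL testing.

    snps: dict id -> (chrom, position, dosages)
    cpgs: dict id -> (chrom, position)
    """
    pairs = []
    for snp_id, (s_chrom, s_pos, dosages) in snps.items():
        # SNPs with MAF below the threshold are filtered out.
        if minor_allele_frequency(dosages) < min_maf:
            continue
        for cpg_id, (c_chrom, c_pos) in cpgs.items():
            if s_chrom == c_chrom and abs(s_pos - c_pos) <= window:
                pairs.append((snp_id, cpg_id))
    return pairs
```

Restricting tests to the cis window in this way keeps the multiple-testing burden far below that of an all-pairs scan, which matters when a Holm correction is applied afterwards.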
Mendelian randomisation was performed using TwoSampleMR to determine causal inference [24] for the top meQTLs. For this analysis, meQTLs independent of IBD were obtained using the same filtering thresholds as described above. MeQTL SNPs were used as the instrumental variable, CpG as the exposure, and IBD as the outcome variable.

Gene expression profiling
