Study demographics
The IBD Character cohort is a multi-centre inception cohort in which 247 (72%) of the 343 IBD patients were treatment naïve at recruitment. Genome-wide methylation was measured in 638 DNA samples extracted from peripheral blood (295 controls, 154 CD, 161 UC and 28 IBD-unclassified (IBD-U) patients). Table 1 summarises study recruitment and patient demographics. The mean age was 34 years (range 7–79) in patients with IBD (n=343) and 33 years (range 3–79) in controls (n=295). A total of 27% of CD patients had a colonic disease phenotype at recruitment, while 42% of patients with UC exhibited extensive colitis. Two hundred individuals were recruited in the UK, 367 in Scandinavia and 66 in Spain at presentation for investigation of suspected IBD. After investigation, IBD was confirmed in 33% of those recruited in the UK, and in 58% and 39% of the Scandinavian and Spanish cohorts respectively (Table 1).
Differentially methylated probes in Inflammatory Bowel Disease
Across the entire cohort, 137 probes exhibited Holm-significant IBD-associated methylation differences when IBD was compared with controls (Supplementary Table 1). These included probes mapping to the loci VMP1/MIR21 (p=9.11×10⁻¹⁵), SBNO2 (p=2.70×10⁻¹⁴), RPS6KA2 (p=6.43×10⁻¹³) and TNFSF10 (p=7.72×10⁻⁸), thereby replicating and validating our previous findings⁹. Novel findings include differential methylation of PHOSPHO1 (p=3.43×10⁻⁹) and SELPLG (p=2.54×10⁻⁷). Table 2 summarises the top differentially methylated positions (DMPs).
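As an illustration of the Holm adjustment referred to above, the following minimal sketch computes Holm step-down adjusted p-values for a small set of toy values. The function name `holm_adjust` and the input p-values are hypothetical and are not taken from the study pipeline.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Toy p-values (hypothetical, for illustration only).
raw = [0.001, 0.04, 0.03, 0.5]
adj = holm_adjust(raw)
significant = [p < 0.05 for p in adj]
```

In this toy example only the first probe would be called Holm-significant at the 0.05 level.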
Similar analyses were performed to identify UC- and CD-specific DMPs. There were 72 DMPs that differentiated CD from controls (Supplementary Table 2) and 67 DMPs that differentiated UC from controls (Supplementary Table 3). Of these, 24 DMPs overlapped between the UC and CD analyses (Supplementary Figure 1). No probes differentiated UC from CD. We also analysed differentially methylated regions (DMRs), defined as regions with ≥3 contiguous probes with FDR-corrected p<0.05 within a distance threshold based on expected probe density. The VMP1 locus on chromosome 17 was the only DMR in our dataset that remained significant using these criteria.
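The DMR definition above can be sketched as a scan for runs of at least three neighbouring significant probes. This is a simplified illustration only: the function name `find_dmrs`, the fixed `max_gap` of 1000 bp and the toy probe coordinates are assumptions, whereas the analysis described above derived its distance threshold from expected probe density.

```python
def find_dmrs(probes, alpha=0.05, min_probes=3, max_gap=1000):
    """Group contiguous significant probes into candidate regions.

    probes: iterable of (chromosome, position, fdr_p) tuples.
    Returns a list of (chromosome, start, end, n_probes) tuples.
    max_gap is a hypothetical stand-in for a density-based threshold.
    """
    regions = []
    run = []  # current run of (chromosome, position) for significant probes

    def flush():
        if len(run) >= min_probes:
            regions.append((run[0][0], run[0][1], run[-1][1], len(run)))

    for chrom, pos, fdr_p in sorted(probes):
        is_significant = fdr_p < alpha
        extends_run = (
            bool(run) and run[-1][0] == chrom and pos - run[-1][1] <= max_gap
        )
        if is_significant and extends_run:
            run.append((chrom, pos))
            continue
        # A non-significant probe, a new chromosome or a large gap ends the run.
        flush()
        run = [(chrom, pos)] if is_significant else []
    flush()
    return regions


# Toy probes (hypothetical coordinates and p-values).
toy_probes = [
    ("chr17", 100, 0.01), ("chr17", 300, 0.02), ("chr17", 450, 0.001),
    ("chr17", 900, 0.6), ("chr17", 1000, 0.01),
    ("chr2", 50, 0.01), ("chr2", 5000, 0.01),
]
dmrs = find_dmrs(toy_probes)
```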
Next, we compared our dataset with previous findings from our group on DMPs in adult and paediatric IBD by correlating the beta values of the top 1000 methylation probes across the 3 cohorts (Supplementary Figure 2)⁸,⁹. Strong correlation was seen across populations (adult BIOM cohort⁹: r=0.96, p<2.20×10⁻³⁰⁸; paediatric cohort: r=0.81, p<2.20×10⁻³⁰⁸).
Consistency of Differentially methylated positions (DMPs) across Northern Europe
We then analysed the consistency of methylation across Europe (Figure 1) by splitting our cohort by geographic area (Scandinavia vs UK vs Spain). Independent DMP analysis using all probes was performed in the Scandinavian cohort, identifying 34 probes that differentiated IBD from controls with a Holm p<0.05 (Table 3). These included SBNO2 (p=2.88×10⁻⁹), VMP1 (p=6.89×10⁻⁸) and RPS6KA2 (p=1.57×10⁻³). A total of 26 of these probes were significant in the UK cohort (n=200). In the Spanish cohort (n=66), only one (RPS6KA2) remained significant (Holm p=0.03). In view of these findings, post-hoc power calculations were performed using the effect sizes observed in the Scandinavian cohort. This analysis indicated that the Spanish cohort was adequately powered to detect significant differences in 11 of the 34 DMPs identified in the Scandinavian dataset (power=0.8, alpha=0.05, effect size cut-off=0.8; Supplementary Table 4). Notably, DMPs that replicated consistently across the UK and Scandinavian samples, including ZEB2, SBNO2 and ZPLD1, were not detected in the Spanish samples even though the cohort was adequately powered to detect them.
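To make the power parameters above concrete, the sketch below uses a normal approximation for a two-sided, two-group comparison with equal group sizes. The exact power method used in the study is not stated here, so the formulae, the equal-group assumption and the function names `required_n_per_group` and `approx_power` are illustrative assumptions.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def required_n_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate per-group n for a two-sided two-sample comparison.

    Normal approximation: n = 2 * (z_(1-alpha/2) + z_power)^2 / d^2,
    where d is the standardised effect size.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power for a given per-group n (same approximation)."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(effect_size * math.sqrt(n_per_group / 2) - z_alpha)


# Effect size cut-off of 0.8 with alpha=0.05 and power=0.8, as quoted above.
n_needed = required_n_per_group(0.8)
```

Under this approximation, a standardised effect of 0.8 needs roughly 25 samples per group, which gives a sense of why only the larger effects would be detectable in a cohort of 66.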
Deciphering tissue and immune cell specificity of differentially methylated probes in IBD
Blood cell-type-specific DNase hypersensitive site (DHS) enrichment testing was performed for our top 137 differentially methylated probes using the eFORGE v2.0 online tool¹⁹. This demonstrated that 62 of the 137 IBD-associated CpGs were enriched in DHSs within monocytes (binomial p=2.04×10⁻⁸). These encompassed the top DMPs in disease, such as OSM, VMP1, SELPLG (PSGL-1) and AIM2. No enrichment was seen in other immune cells such as T-cells, B-cells, NK cells or haematopoietic stem cells (Supplementary Figure 3).
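The binomial p-value quoted above reflects an upper-tail test of observed overlap against a background expectation. The sketch below shows the form of that calculation only: `BACKGROUND_RATE` is a hypothetical placeholder, since the background overlap rate used by eFORGE is not given here, so the result is not expected to reproduce the reported value.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed by direct summation."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical background probability that a probe overlaps a monocyte DHS.
BACKGROUND_RATE = 0.2

# 62 of 137 probes overlapping, as in the text; the rate above is assumed.
p_value = binomial_upper_tail(62, 137, BACKGROUND_RATE)
```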
We then examined the overlap between the circulating blood DMPs in CD and UC in this study and those differentially methylated in human colonic intraepithelial cells (IECs) in CD and UC cohorts²⁰. A total of 38 and 5 probes overlapped in UC and CD respectively (Supplementary Table 5). Top IEC signals that overlapped with the circulating methylome across both UC and colonic CD included SBNO2, RPS6KA2 and SELPLG. Overlapping probes unique to UC included AIM2, MAD1L1, MIR4470 and MIR3679.
The association of DNA methylation with inflammation
To better understand the influence of inflammation on the top differentially methylated probes, three distinct analyses were performed: correlation with inflammatory markers, DMP analysis with inflammatory markers such as hsCRP as covariates, and DMP overlap with published inflammation-associated methylation probes.
We investigated the correlation of the 137 IBD-associated DMPs with inflammatory markers, i.e. high-sensitivity C-reactive protein (hsCRP) and albumin, in individuals with complete data (n=591; Supplementary Figure 4). A total of 98 probes correlated with hsCRP concentration and 118 with albumin levels. The most significant correlations with hsCRP included LIPC (cg27307975, r=-0.56, p<2.22×10⁻¹⁶) and ZEB2 (cg20995564, r=-0.48, p<2.22×10⁻¹⁶). Top DMPs such as VMP1 (cg12054453, r=0.25, p=3.78×10⁻⁸) and RPS6KA2 (cg17501210, r=-0.30, p=7.81×10⁻¹²) showed moderate correlation with hsCRP. However, 39 probes did not correlate with hsCRP, including another VMP1 probe (cg02782634, r=-0.13, p=0.08) (Supplementary Table 6). To adjust for inflammatory activity, we also performed DMP analysis with the previously described covariates plus hsCRP. A total of 59 probes remained significant, including SBNO2 (p=5.26×10⁻¹⁴), RPS6KA2 (p=4.51×10⁻¹¹) and VMP1 (p=5.36×10⁻¹⁰) (Supplementary Table 7).
We then compared our top DMPs with those reported to be associated with CRP levels²¹. A total of 43 of the 218 DMPs identified by Ligthart et al. were differentially methylated in our study. However, 94 probes remained associated with IBD and were independent of the inflammation-associated probes (Supplementary Table 8). These included CLU (cg16292768), SBNO2 (cg12170787) and VMP1 (cg02782634).
Finally, to identify inflammation-independent DMPs, we excluded probes that correlated with hsCRP or overlapped with reported inflammation-associated DMPs. This identified 30 DMPs that showed neither overlap with published inflammation-associated DMPs nor any correlation with inflammation (Supplementary Table 9).
The association of DNA methylation with smoking
Smoking can affect DNA methylation at certain CpG sites in blood, and a recent systematic review summarised a total of 1460 smoking-associated CpG sites²². None of the 137 probes identified in this study overlapped with the reported smoking-associated CpGs. Given the divergent effects of smoking on IBD subtypes, we performed analyses for probes that associate with CD or UC by including smoking, age, sex and cell proportions as covariates. In CD, 107 DMPs remained significant, with SBNO2, RPS6KA2 and VMP1 being the top probes (Supplementary Table 10). In UC, 59 probes remained significant, with SBNO2, VMP1 and ZEB2 being the most significant probes (Supplementary Table 11).
‘Epigenetic age’ acceleration and its association with Inflammatory Bowel Disease
The ‘epigenetic age’ of the patients was calculated using the methodology described by Horvath et al.⁷ There was a strong correlation between actual age and ‘epigenetic age’ in this cohort (r=0.94, CI: 0.93–0.95, p<2.20×10⁻¹⁶; Supplementary Figure 5). Age acceleration (AgeAccel) was seen in IBD (coefficient 0.94, p<2.2×10⁻¹⁶). DiffAge, defined as the difference between predicted biological age and chronological age, was determined for IBD and controls (including subtypes)⁷,²³. DiffAge differed significantly between IBD and controls (non-IBD: median 4.34 years (IQR: 3.83–4.70) vs. IBD: 5.28 years (IQR: 4.72–5.64); p<2.20×10⁻¹⁶). Differences were also seen when IBD subtypes were compared with non-IBD (vs. UC: 5.08 years (IQR: 4.51–5.49); p<2.20×10⁻¹⁶ and vs. CD: 5.53 years (IQR: 4.99–5.79); p<2.20×10⁻¹⁶; Supplementary Figure 5). DiffAge correlated only weakly with inflammatory markers in the entire cohort (hsCRP: r=0.09, p=0.03; albumin: r=-0.13, p=2.88×10⁻³). No correlations were seen between DiffAge and treatment exposure in UC or CD (UC p=0.89; CD p=0.21).
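The DiffAge definition above (predicted biological age minus chronological age) can be illustrated with a short sketch that computes DiffAge per individual and the group medians. The records and the names `diff_age` and `median_diff_age_by_group` are hypothetical; the predicted ages would in practice come from an epigenetic clock rather than being supplied by hand.

```python
from collections import defaultdict
from statistics import median


def diff_age(predicted_age, chronological_age):
    """DiffAge: predicted (epigenetic) age minus chronological age, in years."""
    return predicted_age - chronological_age


def median_diff_age_by_group(records):
    """records: iterable of (group, chronological_age, predicted_age)."""
    by_group = defaultdict(list)
    for group, chronological, predicted in records:
        by_group[group].append(diff_age(predicted, chronological))
    return {group: median(values) for group, values in by_group.items()}


# Toy records (hypothetical values, not study data).
toy_records = [
    ("non-IBD", 30.0, 34.0), ("non-IBD", 40.0, 44.5), ("non-IBD", 50.0, 53.5),
    ("IBD", 30.0, 35.0), ("IBD", 40.0, 45.5), ("IBD", 50.0, 56.0),
]
medians = median_diff_age_by_group(toy_records)
```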
Germline variations show a strong correlation with DNA methylation (meQTLs)
Using paired genetic and methylation data for the entire cohort (n=638), with age and sex as covariates, meQTLs were generated for the top DMPs (n=137). A total of 2991 cis-meQTLs were identified. After applying a MAF >0.05 filter and Holm adjustment, 341 cis-meQTLs remained significant across 21 unique genes, indicating a strong genetic influence on methylation. Several key loci that were significantly differentially methylated in IBD had a strong genetic influence, including ITGB2 (7 cis-meQTLs; top p=2.83×10⁻¹⁶) and VMP1/MIR21 (143 cis-meQTLs across 6 probes) (Supplementary Table 12). The latter include meQTLs with a known GWAS single nucleotide polymorphism (SNP), rs1292053, and with its previously reported LD SNP (rs8078424, r²=0.43, top p=1.48×10⁻²⁰)⁹. Other novel IBD-relevant associations include AIM2 (16 cis-meQTLs; top p=2.83×10⁻¹⁶).
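The minor allele frequency (MAF) filter mentioned above can be sketched as follows, assuming genotypes are coded as alternate-allele dosages (0, 1 or 2) with `None` for missing calls. The SNP names, genotype vectors and function name are hypothetical.

```python
def minor_allele_frequency(dosages):
    """MAF from alternate-allele dosages (0/1/2); None marks a missing call."""
    called = [d for d in dosages if d is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1 - alt_freq)


# Toy genotypes (hypothetical SNPs and individuals).
genotypes = {
    "snp_common": [0, 1, 2, 1, 0, 1, 1, 0, 2, 1],
    "snp_rare": [0] * 19 + [1],
    "snp_flipped": [2, 2, 2, 1, 1, 2, 2, 2, 2, None],
}

MAF_THRESHOLD = 0.05  # threshold quoted in the text
kept = [
    snp for snp, dosages in genotypes.items()
    if minor_allele_frequency(dosages) > MAF_THRESHOLD
]
```

Taking the minimum of the two allele frequencies keeps the filter independent of which allele happens to be labelled as the alternate.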
To assess whether DNA methylation plays a causal role in IBD, Mendelian randomisation was applied to our dataset using TwoSampleMR²⁴. The most significant meQTL for each CpG (sentinel meQTL) was generated using all SNPs and methylation probes, independent of a diagnosis of IBD (Supplementary Table 13). Using sentinel meQTLs as the instrumental variable, methylation as the exposure variable and IBD as the outcome variable, no causal associations were identified in our dataset.
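With a single sentinel meQTL per CpG as the instrument, a common single-instrument estimator is the Wald ratio. The sketch below illustrates that estimator in plain Python under a first-order standard error approximation; it is not the TwoSampleMR code path used in the study, and the function name and input values are hypothetical.

```python
from statistics import NormalDist


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument Wald ratio estimate.

    beta_exposure: SNP effect on methylation (the exposure).
    beta_outcome:  SNP effect on IBD (the outcome, e.g. log odds).
    se_outcome:    standard error of beta_outcome.
    Returns (estimate, standard_error, two_sided_p).
    """
    estimate = beta_outcome / beta_exposure
    # First-order approximation: ignores uncertainty in beta_exposure.
    standard_error = se_outcome / abs(beta_exposure)
    z = estimate / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return estimate, standard_error, p_value


# Toy summary statistics (hypothetical values).
estimate, standard_error, p_value = wald_ratio(0.5, 0.1, 0.05)
```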
IBD associated genes are differentially methylated and correlate with gene expression
In this study, paired whole blood gene expression data were available for 590 patients. Of the 137 DMPs identified, 24 probes were located within 200 or 1500 nucleotides upstream of the transcriptional start site (TSS200 and TSS1500), within regulatory regions of a gene (5’ untranslated region (5’UTR), 3’ untranslated region (3’UTR)) or within the body of the gene. We discovered 81 highly significant correlations (Supplementary Table 14), with a series of novel genes demonstrating high correlation between methylation alterations and expression, including CD247 (Body, r=-0.69, p=4.70×10⁻⁸⁰), SELPLG (5’UTR, r=-0.56, p=3.10×10⁻⁴⁶), OSM (TSS1500, r=-0.33, p=2.48×10⁻¹⁴) and AIM2 (TSS1500, r=-0.51, p=6.32×10⁻³⁷). Further sub-analysis investigated differences in correlation between the IBD and non-IBD cohorts (Supplementary Table 15). SELPLG expression and methylation demonstrated a stronger negative correlation within IBD than non-IBD (IBD r=-0.53, p=2.24×10⁻²² vs. non-IBD r=-0.37, p=6.34×10⁻⁸; 5’UTR). A total of 14 probes demonstrated significant correlation in IBD patients but no correlation in non-IBD (Figure 2). For example, ZEB2 demonstrated a negative correlation in IBD (r=-0.45, p=6.43×10⁻¹⁵; gene body) and no significant correlation in controls (r=-0.16, p=0.29). Similarly, OSM expression correlated negatively with methylation within IBD cases, but no correlation was seen within controls (IBD r=-0.32, p=3.64×10⁻⁷ vs. non-IBD r=-0.14, p=0.77; TSS1500). There were 6 genes with methylation probes within the TSS200 region, all demonstrating negative correlations between expression and DNA methylation apart from SYNE2 (IBD r=0.31, p=2.89×10⁻⁶ vs. non-IBD r=0.22, p=0.02).
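The group-wise comparison described above amounts to computing a methylation-expression correlation separately within IBD and non-IBD samples. The sketch below assumes a Pearson correlation, which is an assumption since the correlation method is not named here; the sample values and function names are hypothetical.

```python
import math
from collections import defaultdict


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


def correlation_by_group(samples):
    """samples: iterable of (group, methylation_beta, expression)."""
    betas, expression = defaultdict(list), defaultdict(list)
    for group, beta, expr in samples:
        betas[group].append(beta)
        expression[group].append(expr)
    return {group: pearson_r(betas[group], expression[group]) for group in betas}


# Toy paired values for one probe-gene pair (hypothetical, not study data).
toy_samples = [
    ("IBD", 0.2, 9.0), ("IBD", 0.3, 8.0), ("IBD", 0.4, 7.0), ("IBD", 0.5, 6.0),
    ("non-IBD", 0.2, 7.0), ("non-IBD", 0.3, 7.4),
    ("non-IBD", 0.4, 6.9), ("non-IBD", 0.5, 7.3),
]
r_by_group = correlation_by_group(toy_samples)
```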
Integrative analysis identifies immune cell related activation in IBD
Multi-omics analysis was performed using Multi-omics Factor Analysis (MOFA)²⁵. Integration of IBD-related SNPs and mRNAs around the 137 DMPs produced ten factors, of which the first four explained most of the variability in the dataset (Fig. 3A). Factor 1 and Factor 4 strongly correlated with IBD and hsCRP (Fig. 3B&E). These 2 factors were mostly influenced by DMPs and mRNAs (Fig. 3C) and remained independent of each other (r=0.052, Fig. 3D). Factor 1 was higher in IBD compared with controls regardless of inflammation (Fig. 3F; p<2.20×10⁻¹⁶) and appeared specific for IBD. In contrast, at negligible hsCRP, Factor 4 was similar in IBD patients and controls (p=0.374 when hsCRP <0.5 mg/L) but was elevated in IBD overall (Fig. 3D; p=4.21×10⁻¹³), owing to higher values relative to controls from even marginally increased hsCRP. In UC, Factor 1 reflected the extent of colitis (Fig. 3G). Thus, multi-omics integration revealed DMP-driven, IBD- and hsCRP-associated Factors 1 and 4.
Factor 1 was defined by greater methylation of genes such as CXCR6 and CD247, and reduced methylation of ZEB2 (Fig. 3H-I). The transcriptomic repertoire of Factor 1 prominently featured S100 proteins and matrix metalloproteinases. Only one IBD GWAS SNP contributed to Factor 1: rs7495132 (CRTC3). In gene ontology analysis, Factor 1 was related to inflammation with activation of undifferentiated leukocytes, and to lipid metabolism (Fig. 3J).
Reduced methylation at VMP1 (TMEM49) and at the interferon-inducible inflammasome trigger AIM2 drove Factor 4, which also shared lowered ZEB2 methylation with Factor 1 (Fig. 3H-I). Gene expression behind Factor 4 included the immunosuppressive CD274 (PD-L1), along with Fc gamma receptor 1 (FCGR1A, FCGR1B) and AIM2 (overexpressed, consistent with lower methylation; Fig. 3J). The GWAS polymorphism rs1801274 (FCGR2A) was negatively correlated with Factor 4. Pathway analysis uncovered potential roles for Factor 4 in immune regulation, lipid metabolism and cell death (Fig. 3K). The composition of Factor 4 suggests that IBD-related differential methylation at VMP1 may relate to opsonization and phagocytosis, and points to a close coupling of pro- and anti-inflammatory responses.
DNA methylation associates with disease course in Inflammatory Bowel Disease
Follow-up data were available for 291 patients with IBD and were used to identify methylation markers that predicted treatment escalation over a median follow-up period of 526 days (IQR: 223–775) (Table 4). Thirty-nine patients with a diagnosis of CD, 26 patients with UC and 2 with IBD-U required escalation after a median follow-up time of 98 days (IQR: 40–229). The median age in this group was 28 years (range: 18–67) and 58% were male (n=39). For downstream analysis, data were split into a training set (two-thirds) and a testing set (one-third) for signal validation. To investigate DMPs that associate with treatment escalation, principal component (PC) analysis using all methylation probes was performed to identify PCs that associate with treatment escalation. In the training set (n=194; 40 escalations), 11 PCs significantly associated with treatment escalation (Supplementary Figure 6). The first 2 PCs correlated with cell proportions, whereas certain PCs, such as PC17, associated only with treatment escalation. Probes that represent the top/bottom 5% of variance within this principal component are shown in Supplementary Figure 7.
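The two-thirds/one-third split described above can be sketched as a seeded random partition of patient identifiers. Whether the original split was random, stratified or seeded is not stated here, so the shuffle, the seed and the identifier format below are assumptions.

```python
import random


def split_train_test(ids, train_fraction=2 / 3, seed=0):
    """Randomly partition ids into training and testing lists.

    The seed is arbitrary and only makes this sketch reproducible.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 291 hypothetical patient identifiers, matching the follow-up cohort size.
patients = [f"patient_{i:03d}" for i in range(291)]
train_ids, test_ids = split_train_test(patients)
```

With 291 patients, this yields 194 training and 97 testing samples, matching the set sizes reported in the text.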
Probes that represented the top 5% of variance across all 11 PCs that associated with treatment escalation in the training set were selected (n=55) for further LASSO-penalised Cox regression. A total of 3 methylation probes remained significant predictors in IBD (Table 5). These were Transporter associated with Antigen Processing 1 (TAP1; HR: 12.79), Regulatory protein of MTOR complex 1 (RPTOR; HR: 1.47) and Thymocyte expressed, positive selection associated 1 (TESPA1; HR: 1.29). A model incorporating these 3 probes was then tested in an independent testing set (n=97; 27 escalations). The 3-probe combined model remained a significant predictor of treatment escalation in the testing set, with a HR of 5.19 (CI: 2.14–12.56, logrank p=9.70×10⁻⁴; Figure 4). Adjusting for treatment naivety did not influence the top differentially methylated probes among patients with IBD. Similar analyses were performed to generate a model that included known clinical predictors (age, sex, hsCRP and albumin), trained in the training set (n=174) and validated in the testing set (n=88), restricted to patients with complete covariate data. This clinical model had a HR of 4.55 (CI: 2.00–10.35; p=1.03×10⁻⁴) and performed on par with the methylation model. Supplementary Table 16 shows correlations of the top 3 prognostic DMPs with conventional inflammatory markers.
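To illustrate how per-probe hazard ratios combine in a Cox-type model, the sketch below forms a linear predictor from the log of the three hazard ratios quoted above. This is an assumption-laden illustration: the scale on which each probe was entered into the model is not given here, the fitted combined model may use different coefficients, and the function names and example inputs are hypothetical.

```python
import math

# Hazard ratios as quoted in the text; the covariate scale is an assumption.
HAZARD_RATIOS = {"TAP1": 12.79, "RPTOR": 1.47, "TESPA1": 1.29}

# In a Cox model the coefficient is the natural log of the hazard ratio.
COEFFICIENTS = {gene: math.log(hr) for gene, hr in HAZARD_RATIOS.items()}


def risk_score(probe_values):
    """Linear predictor: sum of coefficient times probe value (per unit)."""
    return sum(COEFFICIENTS[gene] * probe_values[gene] for gene in COEFFICIENTS)


def relative_hazard(patient_a, patient_b):
    """Hazard of patient_a relative to patient_b under proportional hazards."""
    return math.exp(risk_score(patient_a) - risk_score(patient_b))


# Hypothetical standardised probe values for two patients.
reference = {"TAP1": 0.0, "RPTOR": 0.0, "TESPA1": 0.0}
one_unit_tap1 = {"TAP1": 1.0, "RPTOR": 0.0, "TESPA1": 0.0}
hr_tap1 = relative_hazard(one_unit_tap1, reference)
```

Because the model is multiplicative on the hazard scale, a one-unit increase in all three probes would correspond to the product of the three hazard ratios under these assumptions.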