DNAm exposures in tumour and normal tissue
For all the DNAm exposures, the beta-values of most of the associated CpG sites were moderately and significantly correlated between each patient’s tumour and adjacent normal tissue, indicating that the exposures affected tumour and normal tissue similarly within each individual. The normalised distributions of the tumour versus adjacent normal correlation coefficients for each DNAm exposure are shown in Fig. 1A. For the DNAm alcohol, BMI and smoking exposure CpG sites, the median Spearman’s correlation coefficients were 0.35 (interquartile range (IQR): 0.27–0.44), 0.36 (IQR: 0.27–0.45) and 0.35 (IQR: 0.27–0.44), respectively. Furthermore, there was no relationship between the DNAm exposure signature CpG site beta coefficients and their tumour versus normal correlation coefficients (Fig. 1B). In the hierarchical clustering analysis, the DNAm alcohol and BMI exposure signature CpG beta-values did not tend to separate into normal and tumour tissue samples, indicating that these groups did not cluster separately (Figs. 2A, 2B). In contrast, the DNAm smoking exposure CpG beta-values tended to separate into normal and tumour tissue samples, indicating a systematic difference in the smoking exposure signature beta-values between the two tissue types (Fig. 2C). DNAm alcohol and BMI exposure z-scores in patients’ tumour and adjacent normal tissue were moderately correlated, with Pearson’s correlation coefficients of 0.55 and 0.39, respectively (Figs. 3A and 3B). The DNAm smoking exposure z-scores in patients’ tumour and adjacent normal tissue were weakly correlated, with a correlation coefficient of 0.2 (Fig. 3C). This is consistent with the hierarchical clustering findings, indicating that tumour and normal tissue methylation levels differed at the DNAm smoking exposure CpG sites.
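To make the per-CpG comparison concrete, the sketch below computes a Spearman correlation between tumour and adjacent normal beta-values for each CpG site across patients, then summarises the coefficients by their median and IQR. It is only an illustration of the kind of summary reported above: the CpG names and beta-values are made up, the helper functions are hypothetical, and it is not the authors' pipeline.

```python
from statistics import mean, median, quantiles


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical beta-values: one list per CpG site, one entry per patient,
# with patients in the same order in both tissues.
tumour = {
    "cpg_a": [0.10, 0.25, 0.40, 0.55, 0.70],
    "cpg_b": [0.80, 0.60, 0.65, 0.30, 0.20],
    "cpg_c": [0.50, 0.45, 0.52, 0.48, 0.51],
}
normal = {
    "cpg_a": [0.12, 0.20, 0.35, 0.60, 0.66],
    "cpg_b": [0.75, 0.70, 0.50, 0.35, 0.25],
    "cpg_c": [0.49, 0.51, 0.47, 0.50, 0.52],
}

# One coefficient per CpG site, then a median and IQR over sites.
rhos = [spearman(tumour[cpg], normal[cpg]) for cpg in tumour]
q1, _, q3 = quantiles(rhos, n=4)
print(f"median rho = {median(rhos):.2f}, IQR = {q1:.2f} to {q3:.2f}")
```

With real data, each signature would contribute one coefficient per CpG site, and the median and IQR would be taken over all sites in that signature.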
DNAm exposures and cancer prognosis
The DNAm exposure-associated HRs for cancer survival and the reported exposure-associated ORs for cancer risk, for each of the 24 cancer types, are shown in Tables 1–3, with model fit statistics presented in Supplementary Table 2. In the unadjusted analyses, high DNAm alcohol exposures were significantly associated with poorer survival in patients with bladder (BLCA) and brain (LGG) cancers and with better survival in thyroid (THCA) and kidney (KIRC, KIRP) cancers. After adjustment for potential confounding factors, high DNAm alcohol exposures remained significantly associated with survival in BLCA (HR = 1.3, 95% CI (1.0–1.6), p = 0.020), KIRP (HR = 0.47, 95% CI (0.3–0.8), p < 0.01) and LGG (HR = 1.6, 95% CI (1.3–2.0), p < 0.0001), and became significantly associated with poorer survival in esophageal (ESCA) (HR = 1.5, 95% CI (1.0–2.1), p = 0.030) and head and neck (HNSC) (HR = 1.3, 95% CI (1.0–1.6), p = 0.042) cancers.
In the unadjusted analyses, high DNAm BMI exposures were significantly associated with poorer survival in patients with bladder (BLCA), postmenopausal breast (BRCA), brain (LGG), pancreatic (PAAD) and rectal (READ) cancers and with better survival in kidney (KIRC) cancer. After adjustment for potential confounding factors, high DNAm BMI exposures remained significantly associated with survival in BLCA (HR = 1.2, 95% CI (1.0–1.4), p = 0.015), postmenopausal BRCA (HR = 1.44, 95% CI (1.1–2.0), p = 0.018) and PAAD (HR = 2.1, 95% CI (1.5–3.0), p < 0.0001) cancers.
In the unadjusted analyses, DNAm smoking exposures were significantly associated with poorer survival in patients with B-cell lymphoma (DLBC), lung (LUSC) and stomach (STAD) cancers and with better survival in kidney (KIRC, KIRP) cancers. After adjustment for potential confounding factors, the DNAm smoking exposures remained significantly associated with survival in DLBC (HR = 2.7, 95% CI (1.1–6.5), p = 0.028), KIRC (HR = 0.7, 95% CI (0.54–0.9), p < 0.01), LUSC (HR = 1.2, 95% CI (1.0–1.5), p = 0.049) and STAD (HR = 1.3, 95% CI (1.1–1.7), p = 0.016) cancers, and became significantly associated with poorer survival in bladder cancer (BLCA) (HR = 1.2, 95% CI (1.0–1.5), p = 0.034). For 14 out of 24 tumour types we were also able to adjust for response to first line treatment (complete response versus stable disease/progression) as a potential confounder and found that the majority of results remained similar (Supplementary Table 3).
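As a reading aid for the HRs, confidence intervals and p-values quoted above, the following sketch shows how the three quantities relate under a Wald-type normal approximation on the log-hazard scale. It assumes, without confirmation from the source, that the reported intervals are Wald 95% intervals; the function name is hypothetical. Because the published values are rounded, any p-value recovered this way is approximate.

```python
import math


def wald_from_hr(hr, lower, upper, z_crit=1.959964):
    """Recover log(HR), its standard error, the Wald z statistic and a
    two-sided p-value from a reported HR and its 95% CI.

    Assumes the interval is exp(log(HR) +/- z_crit * SE)."""
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail
    return log_hr, se, z, p


# Adjusted DNAm BMI result for PAAD as quoted in the text:
# HR = 2.1, 95% CI (1.5-3.0), reported p < 0.0001.
log_hr, se, z, p = wald_from_hr(2.1, 1.5, 3.0)
print(f"log(HR) = {log_hr:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.1e}")
```

For this example the recovered p-value falls below 0.0001, in line with the reported significance level. For results near the 0.05 threshold, rounding of the interval limits can move the recovered value to either side of the cut-off.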
Table 1
Summary of DNAm alcohol exposure cancer survival and reported exposure cancer risks. For 24 cancer types, the DNAm alcohol exposure-associated HRs for cancer survival were calculated from univariable and multivariable Cox proportional hazards models, and the reported alcohol exposure-associated ORs for cancer risk were gathered from meta-analyses in the literature. DNAm alcohol z-scores were used as the independent variable in the univariable and multivariable Cox proportional hazards models, with the multivariable HR analysis also adjusted for age at diagnosis, TNM stage (where applicable) and DNAm smoking exposure scores.
| Cancer (alcohol) | Cancer risk: OR (95% CI) | Univariable survival: HR (95% CI) | Univariable survival: P-value | Univariable survival: N | Multivariable survival: HR (95% CI) | Multivariable survival: P-value | Multivariable survival: N |
|---|---|---|---|---|---|---|---|
| BLCA | 0.95 (0.75–1.20) | 1.40 (1.10–1.70) | ** | 409 | 1.30 (1.00–1.60) | 0.020 | 407 |
| BRCA | 1.61 (1.33–1.94) | 1.10 (0.82–1.40) | 0.680 | 774 | 1.30 (0.97–1.70) | 0.087 | 763 |
| CESC | 0.90 (0.73–1.11) | 1.20 (0.91–1.50) | 0.230 | 299 | 1.10 (0.86–1.40) | 0.400 | 292 |
| CHOL | 2.64 (1.62–4.30) | 1.70 (0.81–3.70) | 0.160 | 36 | 1.70 (0.70–4.10) | 0.240 | 36 |
| COAD | 1.44 (1.25–1.65) | 1.10 (0.76–1.70) | 0.550 | 290 | 1.00 (0.67–1.60) | 0.920 | 280 |
| DLBC | 0.75 (0.64–0.88) | 2.40 (0.83–7.10) | 0.100 | 47 | 1.20 (0.25–5.60) | 0.840 | 41 |
| ESCA | 4.95 (3.86–6.34) | 1.40 (1.00–1.90) | 0.050 | 174 | 1.50 (1.00–2.10) | 0.030 | 168 |
| GBM | 1.45 (0.69–3.08) | 1.30 (0.92–1.70) | 0.160 | 124 | 1.20 (0.88–1.70) | 0.230 | 124 |
| HNSC | 5.13 (4.31–6.10) | 1.20 (0.97–1.50) | 0.086 | 523 | 1.30 (1.00–1.60) | 0.042 | 523 |
| KICH | 0.79 (0.72–0.86) | 0.57 (0.29–1.10) | 0.100 | 66 | 0.79 (0.41–1.50) | 0.500 | 66 |
| KIRC | 0.79 (0.72–0.86) | 0.54 (0.40–0.75) | ** | 316 | 0.83 (0.60–1.20) | 0.270 | 314 |
| KIRP | 0.79 (0.72–0.86) | 0.56 (0.36–0.86) | * | 274 | 0.47 (0.29–0.76) | * | 256 |
| LGG | 1.45 (0.69–3.08) | 1.40 (1.10–1.80) | * | 504 | 1.60 (1.30–1.90) | *** | 504 |
| LIHC | 2.07 (1.66–2.58) | 1.10 (0.89–1.30) | 0.440 | 375 | 1.10 (0.92–1.40) | 0.240 | 351 |
| LUAD | 1.15 (1.02–1.30) | 0.84 (0.66–1.10) | 0.150 | 455 | 0.79 (0.61–1.00) | 0.064 | 451 |
| LUSC | 1.15 (1.02–1.30) | 1.10 (0.89–1.40) | 0.340 | 365 | 1.10 (0.87–1.40) | 0.460 | 362 |
| OV | 1.03 (0.95–1.12) | 1.50 (0.61–3.70) | 0.370 | 10 | 7.10 (0.98–52.00) | 0.052 | 10 |
| PAAD | 1.19 (1.11–1.28) | 1.20 (0.83–1.70) | 0.360 | 184 | 1.20 (0.81–1.60) | 0.430 | 181 |
| PRAD | 1.09 (0.98–1.21) | 1.50 (0.49–4.80) | 0.460 | 484 | 1.90 (0.59–6.50) | 0.280 | 484 |
| READ | 1.44 (1.25–1.65) | 0.64 (0.26–1.60) | 0.330 | 94 | 0.75 (0.19–2.90) | 0.680 | 85 |
| STAD | 1.44 (1.25–1.65) | 1.20 (0.96–1.60) | 0.110 | 393 | 1.30 (0.96–1.60) | 0.100 | 382 |
| THCA | 0.81 (0.71–0.94) | 0.41 (0.21–0.81) | 0.011 | 502 | 0.59 (0.23–1.50) | 0.270 | 500 |
| UCEC | 0.99 (0.84–1.16) | 1.00 (0.79–1.30) | 0.830 | 425 | 1.10 (0.83–1.40) | 0.560 | 425 |
| UCS | 1.33 (1.01–1.76) | 1.00 (0.66–1.50) | 0.980 | 57 | 1.10 (0.67–1.70) | 0.790 | 57 |

Abbreviations: confidence interval (CI), deoxyribonucleic acid methylation (DNAm), number of patients (N), hazards ratio (HR), odds ratio (OR) and tumour-node-metastasis (TNM)
Significant HR associations are shown in bold
* p < 0.01, ** p < 0.001, *** p < 0.0001
Table 2
Summary of DNAm BMI exposure cancer survival and reported exposure cancer risks.
| Cancer (BMI) | Cancer risk: OR (95% CI) | Univariable survival: HR (95% CI) | Univariable survival: P-value | Univariable survival: N | Multivariable survival: HR (95% CI) | Multivariable survival: P-value | Multivariable survival: N |
|---|---|---|---|---|---|---|---|
| BLCA | 1.05 (0.99–1.12) | 1.30 (1.10–1.50) | * | 409 | 1.20 (1.00–1.40) | 0.015 | 407 |
| BRCA¹ | 0.89 (0.85–0.94) | 1.68 (0.87–3.24) | 0.120 | 167 | 1.84 (0.95–3.55) | 0.070 | 165 |
| BRCA² | 1.05 (1.03–1.08) | 1.56 (1.15–2.12) | 0.004 | 495 | 1.44 (1.06–1.96) | 0.018 | 492 |
| CESC | 1.14 (1.03–1.26) | 1.00 (0.78–1.30) | 0.900 | 299 | 0.96 (0.73–1.30) | 0.780 | 292 |
| CHOL | 1.50 (1.21–1.85) | 1.00 (0.72–1.50) | 0.840 | 36 | 1.10 (0.72–1.80) | 0.570 | 36 |
| COAD | 1.11 (1.07–1.15) | 0.98 (0.65–1.50) | 0.910 | 290 | 0.86 (0.57–1.30) | 0.470 | 280 |
| DLBC | 1.00 (0.95–1.05) | 1.10 (0.40–2.80) | 0.890 | 47 | 0.95 (0.38–2.40) | 0.920 | 41 |
| ESCA | 1.16 (1.09–1.24) | 1.10 (0.78–1.40) | 0.730 | 174 | 1.10 (0.75–1.50) | 0.770 | 168 |
| GBM | 1.02 (0.94–1.10) | 1.30 (0.77–2.10) | 0.340 | 124 | 0.82 (0.49–1.40) | 0.450 | 124 |
| HNSC | 1.07 (0.91–1.26) | 1.00 (0.86–1.20) | 0.910 | 523 | 1.00 (0.89–1.20) | 0.570 | 523 |
| KICH | 1.25 (1.13–1.38) | 1.90 (0.73–5.10) | 0.180 | 66 | 2.00 (0.67–5.80) | 0.220 | 66 |
| KIRC | 1.25 (1.13–1.38) | 0.70 (0.53–0.93) | 0.015 | 316 | 0.84 (0.63–1.10) | 0.260 | 314 |
| KIRP | 1.25 (1.13–1.38) | 0.97 (0.66–1.40) | 0.900 | 274 | 0.91 (0.61–1.40) | 0.650 | 256 |
| LGG | 1.02 (0.94–1.10) | 1.60 (1.10–2.30) | 0.021 | 504 | 1.10 (0.72–1.60) | 0.740 | 504 |
| LIHC | 1.26 (1.14–1.40) | 0.95 (0.78–1.20) | 0.620 | 375 | 0.96 (0.77–1.20) | 0.680 | 351 |
| LUAD | 0.99 (0.93–1.05) | 1.00 (0.75–1.30) | 1.000 | 455 | 0.96 (0.72–1.30) | 0.800 | 451 |
| LUSC | 0.99 (0.93–1.05) | 0.91 (0.72–1.10) | 0.410 | 365 | 0.92 (0.72–1.20) | 0.470 | 362 |
| OV | 1.08 (1.02–1.15) | 3.70 (0.57–23.00) | 0.170 | 10 | 12.00 (0.93–150) | 0.057 | 10 |
| PAAD | 1.11 (1.03–1.19) | 2.00 (1.40–2.70) | *** | 184 | 2.10 (1.50–3.00) | *** | 181 |
| PRAD | 0.96 (0.93–0.99) | 0.81 (0.34–2.00) | 0.650 | 484 | 0.76 (0.32–1.80) | 0.540 | 484 |
| READ | 1.05 (0.99–1.12) | 5.40 (1.60–18.0) | * | 94 | 3.00 (0.78–11.00) | 0.110 | 85 |
| STAD | 1.08 (1.00–1.18) | 1.20 (0.91–1.50) | 0.230 | 393 | 1.20 (0.91–1.50) | 0.250 | 382 |
| THCA | 1.11 (0.99–1.25) | 1.20 (0.67–2.10) | 0.530 | 502 | 1.40 (0.74–2.60) | 0.310 | 500 |
| UCEC | 2.98 (2.63–3.39) | 0.89 (0.65–1.20) | 0.480 | 425 | 1.00 (0.76–1.40) | 0.860 | 425 |
| UCS | 1.63 (1.55–1.71) | 1.00 (0.69–1.50) | 0.880 | 57 | 1.20 (0.80–1.90) | 0.340 | 57 |

Abbreviations: body mass index (BMI), confidence interval (CI), deoxyribonucleic acid methylation (DNAm), number of patients (N), hazards ratio (HR), odds ratio (OR) and tumour-node-metastasis (TNM)
¹ premenopausal breast cancer
² postmenopausal breast cancer
Significant HR associations are shown in bold
* p < 0.01, ** p < 0.001, *** p < 0.0001
Table 3
Summary of DNAm smoking exposure cancer survival and reported exposure cancer risks. For 24 cancer types, the DNAm smoking exposure-associated HRs for cancer survival were calculated from univariable and multivariable Cox proportional hazards models, and the reported smoking exposure-associated ORs for cancer risk were gathered from meta-analyses in the literature. DNAm smoking z-scores were used as the independent variable in the univariable and multivariable Cox proportional hazards models, with the multivariable HR analysis also adjusted for age at diagnosis, TNM stage (where applicable) and DNAm BMI exposure scores.
| Cancer (smoking) | Cancer risk: OR (95% CI) | Univariable survival: HR (95% CI) | Univariable survival: P-value | Univariable survival: N | Multivariable survival: HR (95% CI) | Multivariable survival: P-value | Multivariable survival: N |
|---|---|---|---|---|---|---|---|
| BLCA | 3.29 (2.61–4.15) | 1.10 (0.96–1.30) | 0.140 | 409 | 1.20 (1.00–1.50) | 0.034 | 407 |
| BRCA | 1.13 (1.04–1.22) | 0.95 (0.75–1.20) | 0.640 | 774 | 0.96 (0.76–1.20) | 0.740 | 763 |
| CESC | 2.03 (1.49–2.57) | 1.10 (0.84–1.40) | 0.490 | 299 | 1.10 (0.84–1.50) | 0.460 | 292 |
| CHOL | 1.45 (1.11–1.88) | 1.30 (0.64–2.80) | 0.440 | 36 | 1.50 (0.65–3.20) | 0.360 | 36 |
| COAD | 1.25 (1.14–1.37) | 0.92 (0.69–1.20) | 0.530 | 290 | 0.86 (0.63–1.20) | 0.320 | 280 |
| DLBC | 1.16 (0.98–1.37) | 2.30 (1.10–4.70) | 0.021 | 47 | 2.70 (1.10–6.50) | 0.028 | 41 |
| ESCA | 3.10 (2.68–3.58) | 0.92 (0.73–1.20) | 0.520 | 174 | 0.97 (0.74–1.30) | 0.820 | 168 |
| GBM | 1.08 (0.94–1.25) | 0.80 (0.56–1.20) | 0.240 | 124 | 0.95 (0.65–1.40) | 0.770 | 124 |
| HNSC | 4.83 (3.72–6.29) | 0.97 (0.82–1.10) | 0.730 | 523 | 1.00 (0.87–1.20) | 0.680 | 523 |
| KICH | 2.10 (1.77–2.50) | 0.82 (0.48–1.40) | 0.480 | 66 | 0.99 (0.55–1.80) | 0.980 | 66 |
| KIRC | 2.10 (1.77–2.50) | 0.58 (0.46–0.74) | *** | 316 | 0.70 (0.54–0.90) | * | 314 |
| KIRP | 2.10 (1.77–2.50) | 0.68 (0.49–0.94) | 0.021 | 274 | 0.90 (0.62–1.30) | 0.590 | 256 |
| LGG | 1.08 (0.94–1.25) | 1.20 (0.89–1.70) | 0.200 | 504 | 1.10 (0.81–1.60) | 0.460 | 504 |
| LIHC | 1.52 (1.24–1.85) | 1.10 (0.92–1.20) | 0.420 | 375 | 1.00 (0.88–1.20) | 0.760 | 351 |
| LUAD | 21.40 (19.7–23.2) | 0.96 (0.79–1.20) | 0.690 | 455 | 0.97 (0.80–1.20) | 0.800 | 451 |
| LUSC | 21.40 (19.7–23.2) | 1.20 (1.00–1.50) | 0.040 | 365 | 1.20 (1.00–1.50) | 0.049 | 362 |
| OV | 1.04 (0.95–1.15) | 1.00 (0.38–2.60) | 1.000 | 10 | 170 (0.21–15000) | 0.130 | 10 |
| PAAD | 2.30 (2.08–2.53) | 0.93 (0.72–1.20) | 0.600 | 184 | 1.30 (0.94–1.80) | 0.120 | 181 |
| PRAD | 0.85 (0.77–0.95) | 0.26 (0.05–1.20) | 0.091 | 484 | 0.26 (0.05–1.40) | 0.120 | 484 |
| READ | 1.25 (1.14–1.37) | 0.86 (0.52–1.40) | 0.570 | 94 | 1.00 (0.60–1.70) | 0.990 | 85 |
| STAD | 2.00 (1.67–2.39) | 1.30 (1.00–1.60) | 0.022 | 393 | 1.30 (1.10–1.70) | 0.016 | 382 |
| THCA | 0.52 (0.35–0.78) | 0.57 (0.28–1.20) | 0.120 | 502 | 0.49 (0.21–1.10) | 0.099 | 500 |
| UCEC | 0.75 (0.58–0.95) | 0.88 (0.68–1.10) | 0.330 | 425 | 1.00 (0.79–1.30) | 0.840 | 425 |
| UCS | 0.83 (0.65–1.04) | 1.10 (0.74–1.60) | 0.690 | 57 | 1.10 (0.75–1.60) | 0.640 | 57 |

Abbreviations: body mass index (BMI), confidence interval (CI), deoxyribonucleic acid methylation (DNAm), number of patients (N), hazards ratio (HR), odds ratio (OR) and tumour-node-metastasis (TNM)
Significant HR associations are shown in bold
* p < 0.01, ** p < 0.001, *** p < 0.0001
In the DNAm BMI-associated HR analyses of the full, pre-menopausal and post-menopausal BRCA groups, DNAm BMI, age and late TNM stage were all significant predictors of survival in the full and post-menopausal groups, whereas no variable was a significant predictor of survival in the pre-menopausal group (Table 4). Ovarian cancer was excluded from the subsequent analyses because of low patient numbers.
Table 4
DNAm BMI exposure survival analysis for breast cancer groups. The DNAm BMI exposure associated HRs were calculated by multivariable Cox proportional hazards analysis with adjustment for age at diagnosis, TNM stage (where available) and DNAm smoking exposure score, for the full, pre-menopausal and post-menopausal breast cancer groups.
| Variables | BRCA full: HR | BRCA full: P-value | BRCA full: N | BRCA pre-menopausal: HR | BRCA pre-menopausal: P-value | BRCA pre-menopausal: N | BRCA post-menopausal: HR | BRCA post-menopausal: P-value | BRCA post-menopausal: N |
|---|---|---|---|---|---|---|---|---|---|
| BMI¹ | 1.60 (1.25–2.00) | ** | 774 | 1.84 (0.95–3.60) | 0.069 | 168 | 1.44 (1.06–2.00) | 0.018 | 495 |
| Smoking¹ | 0.96 (0.76–1.20) | 0.741 | 774 | 0.84 (0.41–1.70) | 0.634 | 168 | 0.92 (0.69–1.20) | 0.589 | 495 |
| Age | 1.04 (1.02–1.10) | *** | 774 | 0.95 (0.86–1.00) | 0.229 | 168 | 1.05 (1.03–1.10) | *** | 495 |
| Stage I & II | 1.00 (reference) | | 555 | 1.00 (reference) | | 112 | 1.00 (reference) | | 369 |
| Stage III & IV | 2.77 (1.83–4.20) | *** | 208 | 1.77 (0.52–6.00) | 0.362 | 53 | 2.98 (1.78–5.00) | *** | 123 |

Abbreviations: body mass index (BMI), breast cancer (BRCA), deoxyribonucleic acid methylation (DNAm), number of patients (N), and hazards ratio (HR)
¹ DNAm exposure z-score
Significant HR associations are shown in bold
* p < 0.01, ** p < 0.001, *** p < 0.0001
For each exposure, the relationship between DNAm exposure-associated cancer survival and reported exposure-associated cancer risk across the 23 cancer types is shown in Fig. 4A. The DNAm exposure-associated HRs for cancer survival and the reported exposure-associated ORs for cancer risk were significantly associated for the alcohol exposure (p = 0.022), but not for the BMI or smoking exposures (p = 0.548 and p = 0.193, respectively). The cancer types with significant DNAm exposure-associated HRs for cancer survival are shown with their reported exposure-associated ORs for cancer risk in Fig. 4B. For kidney (KIRP), esophageal (ESCA) and head and neck (HNSC) cancers with higher alcohol consumption, pancreatic (PAAD) and post-menopausal breast (BRCA) cancers with higher BMI, and stomach (STAD), kidney (KIRC), bladder (BLCA) and lung (LUSC) cancers with smoking exposure, the DNAm exposures were significantly associated with survival and the corresponding reported exposures were also associated with cancer risk, usually in the same direction. In contrast, for bladder (BLCA) and brain (LGG) cancers with higher alcohol consumption, bladder (BLCA) cancer with higher BMI, and B-cell lymphoma (DLBC) with smoking exposure, the DNAm exposures were significantly associated with survival but the corresponding reported exposures were not associated with cancer risk. Interestingly, reported smoking exposure increased the risk of developing kidney (KIRC) cancer, whereas DNAm smoking exposure appeared to be protective in terms of prognosis.
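The "same direction" comparison can be checked directly from the point estimates. The sketch below takes the multivariable HRs and reported ORs from Tables 1–3 for the nine exposure and cancer pairs listed above as risk-associated, and flags any pair in which the HR and OR fall on opposite sides of 1. It compares point estimates only and ignores the confidence intervals, so it is an illustration rather than a reproduction of the analysis behind Fig. 4; the helper name is hypothetical.

```python
import math

# (cancer, exposure, multivariable HR for survival, reported OR for risk),
# copied from Tables 1-3.
pairs = [
    ("KIRP", "alcohol", 0.47, 0.79),
    ("ESCA", "alcohol", 1.50, 4.95),
    ("HNSC", "alcohol", 1.30, 5.13),
    ("PAAD", "BMI", 2.10, 1.11),
    ("BRCA post-menopausal", "BMI", 1.44, 1.05),
    ("STAD", "smoking", 1.30, 2.00),
    ("KIRC", "smoking", 0.70, 2.10),
    ("BLCA", "smoking", 1.20, 3.29),
    ("LUSC", "smoking", 1.20, 21.40),
]


def same_direction(hr, odds_ratio):
    """True when log(HR) and log(OR) share a sign, i.e. both ratios
    lie on the same side of 1."""
    return math.log(hr) * math.log(odds_ratio) > 0


discordant = [(cancer, exposure)
              for cancer, exposure, hr, odds_ratio in pairs
              if not same_direction(hr, odds_ratio)]
concordant_count = len(pairs) - len(discordant)

print(f"{concordant_count} of {len(pairs)} pairs agree in direction")
print("discordant:", discordant)
```

On these values, the only discordant pair is smoking in KIRC, where the reported OR is above 1 and the HR is below 1.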