Pan-cancer promoter hyper-methylation analysis of Protein Phosphatase Enzymes and Interacting Proteins.
To identify epigenetic defects in Protein Phosphatase Enzymes and Interacting Proteins (PPEIP), we studied the DNA methylation status at the promoter regions of 523 PPEIP genes (Supplementary data S1) in five cancer types (colorectal, esophageal, lung, pancreatic and stomach cancers) across three cell models: primary tumors (The Cancer Genome Atlas, TCGA), cancer cell lines [31] and 3D embedded cancer cell cultures, or organoids [35]. DNA methylation profiling of 705 cancer samples was performed using the Illumina Infinium HumanMethylation450 BeadChip (450K) and compared against a baseline control cohort of 42 unrelated healthy individuals from the same five tissue types to identify cancer-associated promoter hypermethylation events (epimutations). A list of all datasets, sample IDs and primary tissues can be found in Supplementary data S2 (see also Methods). A simplified overview of the analysis workflow is provided in Figure 1.
The 450K platform allows genome-wide CpG methylation analysis at approximately 485,000 CpG sites located in various genomic regions, including 98% of all promoters of RefSeq-annotated genes. To rule out potential probe representation bias, we first compared the distribution of probes at PPEIP promoters with that of all other genes represented on the 450K array. In total, 6296 450K probes reported CpG methylation levels in PPEIP promoters (12 probes per gene), and 3260 (52%) of those were located in CpG islands associated with PPEIP promoters (6 probes per CpG island) (Figure 2A). In comparison, 168,664 probes were annotated to all gene promoters (9 probes per gene) and 79,008 (47%) to all promoter-associated CpG islands (4 probes per gene), indicating a higher representation of probes for PPEIP gene promoters than for all other gene promoters (Figure 2A). “Promoter probes” were defined as probes that provided semi-quantitative methylation values for CpGs located within 1500 bp or 200 bp of the transcription start site (TSS1500, TSS200), in the 5’ untranslated region (5’UTR) or in the 1st exon of each gene. The largest share of PPEIP promoter probes was found in the TSS1500 (30%) and the smallest in the 1st exon (13%) (Figure 2B). Half of all PPEIP probes were distributed between the 5’UTR (29%) and the TSS200 (21%) (Figure 2B). The median number of probes per gene was 11, or 7 when considering only CpG islands in PPEIP promoters (Figure 2C). The protein tyrosine phosphatase gene PTPMT1 was the PPEIP gene with the most probes associated with its promoter (n = 60).
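The probe bookkeeping described above can be illustrated with a minimal sketch. It assumes a simplified annotation table of (probe ID, gene, region) rows, with region labels in the style of the 450K manifest gene-region annotation; the probe IDs and gene names are invented for illustration and the function names are hypothetical, not part of the authors' pipeline.

```python
from collections import Counter
from statistics import median

# Region labels treated as "promoter" in the text (450K manifest style labels).
PROMOTER_REGIONS = {"TSS1500", "TSS200", "5'UTR", "1stExon"}

def promoter_probe_counts(annotation):
    """Count promoter probes per gene from (probe_id, gene, region) rows."""
    counts = Counter()
    for _probe_id, gene, region in annotation:
        if region in PROMOTER_REGIONS:
            counts[gene] += 1
    return counts

def region_shares(annotation):
    """Fraction of promoter probes that fall in each promoter region."""
    regions = Counter(r for _, _, r in annotation if r in PROMOTER_REGIONS)
    total = sum(regions.values())
    return {region: n / total for region, n in regions.items()}

# Hypothetical toy annotation; probe IDs and gene names are made up.
toy_annotation = [
    ("cg0001", "GENE_A", "TSS1500"),
    ("cg0002", "GENE_A", "TSS200"),
    ("cg0003", "GENE_A", "Body"),
    ("cg0004", "GENE_B", "5'UTR"),
    ("cg0005", "GENE_B", "1stExon"),
    ("cg0006", "GENE_B", "TSS1500"),
    ("cg0007", "GENE_B", "3'UTR"),
]

counts = promoter_probe_counts(toy_annotation)
median_per_gene = median(counts.values())
shares = region_shares(toy_annotation)
```

Gene-body and 3’UTR probes are ignored, so the per-gene counts and region shares refer to promoter probes only, mirroring the summary statistics reported for Figure 2A to 2C.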
PPEIP promoter hyper-methylation profiles reveal distinct tissue susceptibility and frequency associated with cancer.
Overall, 5007 hypermethylated PPEIP gene promoters (epimutations) were identified in 593 cancer samples (84%), with a median of 4 cancer-associated epimutations per individual. The distribution of epimutations across samples was significantly imbalanced (P = 2.2 × 10^-16). Stomach cancer patients accounted for the largest share of PPEIP epimutations (27%), although stomach cancer cases made up only 18% of the total sample number (Supplementary Figure 1A). This represented the highest ratio of epimutation share to sample share, or observed epimutation distribution ratio (OEDR), of 0.59 (expected OEDR = 0) (Supplementary Figure 1B). Lung cancer patients showed the lowest epimutation share (9%), although 22% of all samples were from lung malignancies (OEDR = -1.25) (Supplementary Figure 1A and B). Colorectal, esophageal and stomach cancers showed positive OEDR values (0.25 to 0.59), whereas lung and pancreatic cancer patients showed negative OEDR values (-1.25 and -0.68, respectively) (Supplementary Figure 1B). Compared with healthy controls, 593 cancer patients (84%) presented with at least one hyper-methylated PPEIP promoter, involving 160 (31%) of the PPEIP genes analyzed. Esophageal cancer was the tissue with the highest proportion of individuals carrying at least one epimutation (98%) (Supplementary Figure 2A). This slight inconsistency with the stomach cancer OEDR data can be explained by the per-individual epimutation frequency in the two tissues: among the top 10% of individuals with the most epimutations, 44% of all epimutations detected were attributed to stomach cancer patients compared with only 28% to esophageal cancer patients (Supplementary data S3). This suggests that in stomach cancer a subset of patients accumulates large numbers of epimutations, whereas in esophageal cancer epimutations are fewer per individual but spread more consistently across individuals.
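The OEDR is not defined explicitly in this section. As a hedged illustration only, the sketch below assumes it is the log2 ratio of a tissue's share of epimutations to its share of samples, which gives an expected value of 0 when the two shares are equal. Under that assumption, the rounded percentages quoted above yield values close to the reported 0.59 (stomach) and -1.25 (lung); small differences would be expected from rounding. The function name is hypothetical.

```python
import math

def oedr(epimutation_share, sample_share):
    """Assumed OEDR: log2 of (share of epimutations / share of samples).

    This definition is an assumption made for illustration; it equals 0
    when a tissue contributes epimutations in proportion to its samples.
    """
    return math.log2(epimutation_share / sample_share)

# Rounded shares quoted in the text.
stomach_oedr = oedr(0.27, 0.18)  # more epimutations than expected -> positive
lung_oedr = oedr(0.09, 0.22)     # fewer epimutations than expected -> negative
balanced_oedr = oedr(0.20, 0.20) # proportional contribution -> zero
```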
Lung cancer showed the lowest proportion of individuals with at least one epimutation (67%), consistent with the OEDR data (Supplementary Figure 2A). Across cell models, 96% of all organoid samples harbored at least one hyper-methylated PPEIP promoter compared with 86% of cancer cell lines and 77% of primary tumors (Supplementary Figure 2B). To assess the data further, we applied our epimutation detection pipeline to 450K data from a separate test cohort of 47 control individuals representing the five analyzed tissues, in order to identify epimutations in a population of healthy individuals. Eleven individuals (23%) carried a total of 79 hyper-methylated PPEIP promoters, compared with 84% of cancer samples carrying at least one. A median epimutation count of 0 per individual was observed (Supplementary Figure 3A) compared with 4 in the same tissues in a cancer context (Supplementary Figure 3B). The epimutations we detected were overwhelmingly enriched in cancer patients relative to healthy individuals (P = 7.02 × 10^-18). Interestingly, epimutations in healthy samples were observed in only two tissues (esophagus, n = 7 individuals; pancreas, n = 4 individuals) (Supplementary data S4). These results gave us confidence that the epimutations identified by our bespoke bioinformatic pipeline were cancer-associated.
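This section does not state which test produced the reported enrichment P value. As one plausible way to compare carrier frequencies between the cancer cohort (593 of 705) and the healthy test cohort (11 of 47), the sketch below implements a one-sided Fisher's exact test using only the standard library. The choice of test is an assumption, so the result is not expected to reproduce the reported value exactly.

```python
from fractions import Fraction
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with all margins
    fixed, where X is the count in the top-left cell.
    """
    row1 = a + b   # e.g. all cancer samples
    col1 = a + c   # e.g. all epimutation carriers
    n = a + b + c + d
    denominator = comb(n, row1)
    k_max = min(row1, col1)
    numerator = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, k_max + 1)
    )
    # Exact rational arithmetic avoids underflow for very small P values.
    return float(Fraction(numerator, denominator))

# Counts taken from the text: carriers vs non-carriers in each cohort.
cancer_with, cancer_total = 593, 705
control_with, control_total = 11, 47
p_value = fisher_exact_greater(
    cancer_with,
    cancer_total - cancer_with,
    control_with,
    control_total - control_with,
)
```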
Next, we examined the frequency of PPEIP-associated epimutations in cancer patients. In total, 5007 epimutations were detected in 160 PPEIP genes (Supplementary data S5). Of these 160 genes, 88 (55%) were considered “rare” (observed in <1% of all cancer cases) and 41 (26%) “recurrent” (identified in >5% of cancer samples) (Supplementary data S5). Many recurrent genes were known tumor suppressors with previously described cancer-related promoter hypermethylation (PTPN13, DUSP5 [2], PPP1R14A [26], PPP1R3C [27], PTPRM [28] and IGFBP3 [42–44]), supporting the robustness of our approach. We also detected several PPEIP genes with previously undescribed recurrent methylation changes (Figure 2D). INPP5B (Inositol Polyphosphate-5-Phosphatase) is an anti-apoptotic protein with a proliferative role in different cancer types [45, 46], and epimutations were observed in 130 individuals across all 5 tissues (Figure 2D). Promoter hyper-methylation of PSTPIP2 (proline-serine-threonine phosphatase interacting protein 2) was observed in 37 cases across all 5 tissues examined (Figure 2D). PSTPIP2 is required for correct cell cycle function, and its dysregulation contributes to abnormal proliferation and terminal differentiation in megakaryocytes [47]. In addition to recurrent differentially methylated promoters, several patients displayed rare, previously undescribed epimutations at PPEIP promoters. For example, an epimutation was detected at the promoter of PPP3CC (Protein Phosphatase 3 Catalytic Subunit Gamma) in 3 individuals with colorectal (n = 1) or stomach (n = 2) cancer. PPP3CC repression contributes to invasion and growth of glioma cells [48]. Loss-of-function genetic mutations in PPP1R3B have been associated with lung cancer [49], and DNA methylation-associated transcriptional silencing may similarly mimic loss of function. We observed one individual with esophageal cancer harboring an epimutation at the PPP1R3B promoter (Figure 2D).
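The frequency classes used above can be expressed as a small sketch. The thresholds (<1% rare, >5% recurrent), the cohort size of 705 and the carrier counts are taken from the text; the label "intermediate" for genes between the two thresholds is a hypothetical name introduced here, as is the function name.

```python
def classify_epimutation_frequency(n_carriers, n_samples,
                                   rare_below=0.01, recurrent_above=0.05):
    """Label a gene by the fraction of cancer samples carrying an epimutation."""
    frequency = n_carriers / n_samples
    if frequency < rare_below:
        return "rare"
    if frequency > recurrent_above:
        return "recurrent"
    # The text does not name this middle class; the label is an assumption.
    return "intermediate"

N_CANCER_SAMPLES = 705

# Carrier counts quoted in the text for four example genes.
carriers = {"INPP5B": 130, "PSTPIP2": 37, "PPP3CC": 3, "PPP1R3B": 1}
labels = {
    gene: classify_epimutation_frequency(n, N_CANCER_SAMPLES)
    for gene, n in carriers.items()
}
```

With these counts, INPP5B and PSTPIP2 fall in the recurrent class and PPP3CC and PPP1R3B in the rare class, in line with how they are described in the text.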
We examined highly epimutated PPEIP genes and their described roles in human malignancies. PTPRT, CDH2, EYA4, SLITRK5, NTRK3, ADCY8, DNAJC6, PPM1E, FBP2 and GRIN3A were the genes whose promoters were hyper-methylated in the largest number of cancer individuals. Epimutation counts were not evenly distributed across cancer tissue types, consistent with our OEDR data; a breakdown is presented in Figure 2E. PTPRT was the most epimutated PPEIP gene, with 43% of all cancer cases showing epimutations in this gene. PTPRT is a tyrosine phosphatase with a previously described role as a tumor suppressor in colorectal cancer (CRC) [13]. Interestingly, the authors of that study also demonstrated that PTPRT was the most frequently genetically mutated tyrosine phosphatase gene in their CRC cohort. In line with this finding, PTPRT was also the most epimutated gene in our CRC cohort: 76% of all CRC patients carried an epimutation. In contrast, only 21% and 19% of all lung and pancreatic cancer patients, respectively, carried an epimutation in PTPRT (Figure 2E). Several of the most frequently epimutated genes show disparate prevalence of hypermethylated promoters between cancer tissue types. Eyes absent 4 (EYA4) is a threonine-tyrosine phosphatase [50] previously described as a tumor suppressor in multiple cancers examined in this study (CRC [51], esophagus [52], lung [53] and pancreas [54]). EYA4 promoter DNA methylation has been reported to be negatively correlated with gene expression, and EYA4 plays an important role in inhibiting cell proliferation via the Wnt and MAPK signaling pathways [55]. The frequency of epimutation differs markedly between tissues: 59% of CRC patients carry hypermethylated EYA4 promoters compared with only 13% of lung cancer patients (Figure 2E). Another example is the NTRK3 gene.
NTRK3 has been described in important cancer-related pathways that promote both survival and cell proliferation, so its reported roles as both an oncogene [56] and a tumor suppressor [57] are not unexpected. We observed at least a twofold difference in the proportion of individuals with hyper-methylated NTRK3 promoters between lung cancer (15%) and the other 4 cancers (CRC, 36%; stomach, 34%; esophagus, 33%; pancreas, 30%) (Figure 2E).
Pan-cancer promoter hyper-methylation of PPEIP affects cellular pathways and networks that favor tumor success.
To further decipher the role of the 160 PPEIP genes susceptible to cancer-associated promoter hypermethylation, we performed an enrichment analysis for gene networks, cellular pathways and transcription factor (TF) binding (Figure 3). For gene network and cellular pathway analysis, three widely cited resources were used: BioCarta [58], Kyoto Encyclopedia of Genes and Genomes (KEGG) [59] and curated WikiPathways [60]. All three resources showed high overlap in well-known cancer pathways such as PI3K-AKT [61], MAPK [62] and cellular metabolism [63] (Figure 3A, C and D). All three pathways present actionable targets for anti-cancer drugs [64–66]. Other interesting pathways include angiogenesis-related VEGFA-VEGFR2 signaling [67], regulatory circuits of STAT3 signaling pathways [68] and gene networks involved in aging [69]. We also interrogated TF targets computed from ChIP-seq data from the ENCODE project [70]. The genes most affected by promoter methylation are also targets of TFs that are frequently mutated in cancer, such as chromatin remodelers (EP300, HDAC2 and KDM4A, among others; extensively reviewed in [71]), a cell cycle regulator (SIN3A [72]) and a regulator of cell proliferation (YY1 [73]) (Figure 3B).
Pan-cancer examination of epimutations in protein tyrosine and dual-specificity phosphatases reveals aberrations in transcriptomic profiles affecting key cellular networks related to cancer.
A number of Protein Tyrosine Phosphatases (PTPs) have been described as tumor suppressor genes in human cancers [2]; among the most studied are PTPRM [74, 75], PTPN13 [17] and PTPRG [76]. In primary tumors, PTPs would therefore represent a subset of key genes in which promoter DNA methylation-induced transcriptional silencing is detrimental. We therefore closely examined PTPs for methylation sensitivity in primary tumors, cancer cell lines and 3D embedded cell cultures (organoids). We performed the same analysis for dual-specificity phosphatases (DUSPs), given their activity and role in cancer [77]. Only a few individuals presented hyper-methylated promoters for serine/threonine phosphatases (<2%), and we therefore focused on PTP and DUSP genes. A list of PTPs and DUSPs was compiled from Ensembl, and DNA methylation profiles were generated for 43 PTP and 24 DUSP genes (Supplementary data S6). Hyper-methylated promoters were observed in 17 PTP genes (40%) and 7 DUSP genes (29%), affecting 410 (57%) and 102 (14%) cancer cases, respectively. Colorectal cancer (CRC) patients showed the highest proportion of individuals with PTP epimutations (81%) and pancreatic cancer patients the lowest (27%) (Figure 4A). Stomach cancers presented the highest proportion of individuals with DUSP epimutations (23%), and lung and pancreatic malignancies the lowest (4%) (Figure 4B). Further analysis revealed that PTPRT was the most ubiquitously epimutated PTP: 43% of all cancer individuals showed hypermethylated PTPRT promoters. DNAJC6 (23%) and PTPRM (16%) were the second and third most epimutated PTPs. PTPRT was also the most epimutated PTP in 4 of the 5 cancer tissues analyzed (CRC, 78%; stomach, 55%; esophagus, 42%; pancreas, 19%) and the second most epimutated in lung (21%). DNAJC6 was the most epimutated PTP in lung (43%) (Figure 4C).
CRC, stomach and esophageal cancers showed overall high proportions of individuals with hyper-methylated PTP promoters. Of the top 10 most epimutated PTPs, stomach (9/10), esophageal (7/10) and CRC (6/10) showed epimutations in >5% of individuals, whereas pancreas and lung (2/10) presented few frequently epimutated PTPs (Figure 4C). DUSP genes were also epimutated, although to a lesser extent (Figure 4B). DUSP26 was the gene with the highest proportion of individuals with hyper-methylated promoters (10%), with DUSP5 (6%), DUSP23 (3%) and DUSP15 (2%) also showing epimutations (Figure 4D). CRC showed the highest proportion of individuals with epimutated DUSP26 (19%), followed by stomach (14%) and esophagus (9%). Again, lung and pancreas showed the lowest (4% and 1%, respectively). DUSP5 was the most epimutated DUSP in stomach (17%) and pancreas (2%), and was equal to DUSP26 in esophageal cancer (9%) (Figure 4D).
Next, we analyzed the effect of epimutations on PTP and DUSP gene expression. Seventeen PTP and seven DUSP genes contained at least one cancer-associated hypermethylated promoter. To determine the potential effect of cancer-associated epimutations on PTP and DUSP transcription, we first analyzed gene expression in normal tissues using the GTEx portal (gtex.org, [78]) for the five tissues analyzed in this study. Nine of 17 PTPs and 5 of 7 DUSPs were expressed at high levels (>5 TPM) in at least one healthy tissue type (Supplementary Figure 4A and B). Further investigation revealed that 4 PTPs (PTPRM, PTPN13, PTPRG and PTPRB) and 3 DUSPs (DUSP23, DUSP5 and DUSP2) were ubiquitous epigenetic outliers in all cancer cell models, were highly expressed in at least one normal tissue (>5 TPM) and maintained expression in the malignant counterpart before samples were segregated by promoter DNA methylation (Supplementary Figure 5). Expression profiles from primary tumors (TCGA) and cancer cell lines (CCLE) for the 4 PTPs and 3 DUSPs are presented in Figure 5. In each boxplot, gene expression is partitioned according to whether individuals carry hypermethylated promoters (average promoter beta value >0.33 in TCGA and >0.66 in CCLE vs healthy controls; see Methods for details). A significant negative correlation between promoter methylation and gene expression was observed for all genes and cell models. We expect these data to also be representative of cancer organoid cell models [35]. Although hypermethylated promoters negatively correlated with gene expression in our cancer cohort (Figure 5), one exception was DNAJC6, for which promoter methylation was positively correlated with gene expression (Supplementary Figure 6A and C). DNAJC6 is the second most epimutated gene across all cancer samples and the most epimutated gene in the lung cancer cohort (43%) (Figures 2 and 4).
Although a high number of epimutations were identified in our analysis cohort, several affected genes showed very low expression in the pertinent tissues. As mentioned above, PTPRT is the most epimutated gene found in this study; however, its expression is extremely low (or absent) in the 5 tissues analyzed (Supplementary Figure 6B and D), with the highest expression observed in brain tissue (Supplementary Figure 4A). This finding suggests that epigenetic dysregulation in cancer cells can occur independently of an active transcriptional program and may provide important genomic information for other tissues.
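The partitioning of expression values by promoter methylation status can be sketched as follows. The beta-value thresholds (0.33 for TCGA, 0.66 for CCLE) are taken from the text; the sketch simplifies the procedure by applying the threshold to the mean promoter beta value of each sample and omits the comparison with healthy controls described in the Methods. The data and function name are hypothetical.

```python
from statistics import mean, median

# Thresholds quoted in the text for calling a promoter hypermethylated.
BETA_THRESHOLDS = {"TCGA": 0.33, "CCLE": 0.66}

def split_by_promoter_methylation(samples, dataset):
    """Split expression values into (hypermethylated, other) groups.

    samples: list of (promoter_probe_betas, expression) pairs, one per sample.
    A sample is called hypermethylated when its mean promoter beta value
    exceeds the dataset threshold (a simplification of the full pipeline).
    """
    threshold = BETA_THRESHOLDS[dataset]
    hyper, other = [], []
    for betas, expression in samples:
        group = hyper if mean(betas) > threshold else other
        group.append(expression)
    return hyper, other

# Hypothetical toy data: promoter probe beta values and expression (TPM).
toy_samples = [
    ([0.05, 0.10, 0.08], 12.0),
    ([0.45, 0.50, 0.40], 1.5),
    ([0.20, 0.30, 0.25], 9.0),
    ([0.70, 0.65, 0.80], 0.4),
]

tcga_hyper, tcga_other = split_by_promoter_methylation(toy_samples, "TCGA")
ccle_hyper, ccle_other = split_by_promoter_methylation(toy_samples, "CCLE")
lower_in_hyper = median(tcga_hyper) < median(tcga_other)
```

The stricter CCLE threshold assigns fewer samples to the hypermethylated group, which is the only difference between the two calls in this toy example.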
PTPRM epigenetic transcriptional silencing correlates with poor clinical outcome and reduced anti-cancer drug sensitivity.
The role of PTPs as tumor suppressors has been described in great detail in a number of tissues [2], and PTPs are targets for anti-cancer therapy [79]. In our cancer cohort, PTPRM was the PTP with the highest number of individuals with epimutations that also showed high median expression values in healthy subjects (>5 TPM). Having demonstrated PTPRM promoter hyper-methylation-associated transcriptional silencing (Figure 5A and B), we asked whether PTPRM epigenetic loss had any impact on clinical outcome in cancer patients. For this, we leveraged complete clinical and transcriptomic data available in the TCGA data repository for all individuals used in this study. We found that PTPRM gene silencing (top 25% expression quartile vs bottom 25%) in pancreatic cancer patients was significantly associated with poor overall survival (log-rank P < 0.05; hazard ratio [HR] = 1.82; 95% confidence interval [CI] = 0.995 – 4.336; P < 0.05).
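The grouping step that precedes the survival comparison can be sketched as below. It assumes that patients are split at the first and third quartiles of PTPRM expression and that the middle half is left out; the quantile interpolation method, the sample IDs and the function name are illustrative assumptions, and the log-rank test and hazard ratio estimation themselves are not implemented here.

```python
from statistics import quantiles

def expression_quartile_groups(expression_by_sample):
    """Return (low, high): sample IDs in the bottom and top expression quartiles."""
    # "inclusive" treats the data as the whole population of interest;
    # the interpolation method actually used is not stated in the text.
    q1, _, q3 = quantiles(expression_by_sample.values(), n=4, method="inclusive")
    low = sorted(s for s, v in expression_by_sample.items() if v <= q1)
    high = sorted(s for s, v in expression_by_sample.items() if v >= q3)
    return low, high

# Hypothetical expression values for eight samples (IDs are made up).
toy_expression = {f"S{i}": float(i) for i in range(1, 9)}
low_group, high_group = expression_quartile_groups(toy_expression)
```

The two groups returned here would then be passed to a survival analysis routine to compare overall survival between low and high expressers.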
PTPRM is an important component of STAT3 regulation, with downstream effects on proliferation and metastasis in lung cancer [80]. We therefore asked whether PTPRM epigenetically deficient cancer cells could be exploited for therapeutic purposes in other tissues, specifically with drugs that target the JAK/STAT cell signaling pathway. We downloaded IC50 Z scores from the GDSC2 (Genomics of Drug Sensitivity in Cancer, dataset 2) database [31] for antitumor drugs that target key cancer-related cellular pathways (Figure 6B). We identified 4 compounds, AZ960 [81], JAK8517, JAK18709 [82] and Ruxolitinib [83], that specifically target proteins in the JAK/STAT pathway and compared their drug sensitivity with PTPRM promoter methylation status in pancreatic cancer cell lines. Drug sensitivity was significantly correlated with DNA methylation levels of the PTPRM gene promoter in pancreatic cancer cell lines (rho = 0.3, P = 0.00462) (Figure 6C), suggesting that PTPRM promoter methylation profiles may be used as a potential biomarker of clinical treatment response in pancreatic cancer therapy.
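The reported rho suggests a rank-based (Spearman) correlation between promoter methylation and IC50 Z scores, although the test is not named in this section. The sketch below computes Spearman's rho with average ranks for ties using only the standard library; the data are invented for illustration and do not correspond to GDSC2 values, and significance testing is omitted.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical promoter beta values and IC50 Z scores for five cell lines.
toy_methylation = [0.10, 0.25, 0.40, 0.55, 0.80]
toy_ic50_z = [-1.2, -0.3, -0.8, 0.4, 1.1]
rho = spearman_rho(toy_methylation, toy_ic50_z)
```

A positive rho on IC50 Z scores means that cell lines with higher promoter methylation tend to require higher drug concentrations, which is the direction implied by the section heading.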