DNA methylation patterns suggest the involvement of DNMT3B and TET1 in osteosarcoma development

DNA methylation may be involved in the development of osteosarcomas. Osteosarcomas commonly arise during the bone growth and remodeling in puberty, making it plausible to infer the involvement of epigenetic alterations in their development. As a highly studied epigenetic mechanism, we investigated DNA methylation and related genetic variants in 28 primary osteosarcomas aiming to identify deregulated driver alterations. Methylation and genomic data were obtained using the Illumina HM450K beadchips and the TruSight One sequencing panel, respectively. Aberrant DNA methylation was spread throughout the osteosarcomas genomes. We identified 3146 differentially methylated CpGs comparing osteosarcomas and bone tissue samples, with high methylation heterogeneity, global hypomethylation and focal hypermethylation at CpG islands. Differentially methylated regions (DMR) were detected in 585 loci (319 hypomethylated and 266 hypermethylated), mapped to the promoter regions of 350 genes. These DMR genes were enriched for biological processes related to skeletal system morphogenesis, proliferation, inflammatory response, and signal transduction. Both methylation and expression data were validated in independent groups of cases. Six tumor suppressor genes harbored deletions or promoter hypermethylation (DLEC1, GJB2, HIC1, MIR149, PAX6, and WNT5A), and four oncogenes presented gains or hypomethylation (ASPSCR1, NOTCH4, PRDM16, and RUNX3). Our analysis also revealed hypomethylation at 6p22, a region that contains several histone genes. Copy-number changes in DNMT3B (gain) and TET1 (loss), as well as overexpression of DNMT3B in osteosarcomas provide a possible explanation for the observed phenotype of CpG island hypermethylation. While the detected open-sea hypomethylation likely contributes to the well-known osteosarcoma genomic instability, enriched CpG island hypermethylation suggests an underlying mechanism possibly driven by overexpression of DNMT3B likely resulting in silencing of tumor suppressors and DNA repair genes.

Anomalous DNA methylation is also a cancer hallmark affecting metastasis and resistance to chemotherapy in patients with osteosarcomas. Promoters of CDKN2A (Oh et al. 2006;Sonaglio et al. 2013), RASSF1, MGMT, DAPK1, TIMP3 (Hou et al. 2006), HIC1, and GADD45 (de Azevedo et al. 2020), are hypermethylated in a proportion of cases in comparison to normal bone tissues as well as hypomethylation of IRX1, SEMA4D, RAF1, and PAK1 (de Azevedo et al. 2020). A combined analysis of DNA methylation from osteosarcoma samples and gene expression from osteosarcoma cell lines disclosed an association between PRAME and SEMA3A hypomethylation and increased expression levels as well as CALD1 and DNALI1 hypermethylation with gene repression in osteosarcomas (Tian et al. 2018). In xenografts, increased IL-6 expression due to DNA demethylation mediated by TET2 within the tumor microenvironment increased the metastatic capability of osteosarcoma cells (Itoh et al. 2018). The potential to form metastasis was also associated with a non-random pattern of the enhancer histone marks H3K4me1 and H3K27ac in metastatic and nonmetastatic osteosarcomas (Morrow et al. 2018).
The genome-wide methylation of osteosarcomas (OS) compared to normal bone tissues could provide information regarding the underlying mechanisms involved with their development. OS are known to be genetically unstable, a characteristic that points to a genome-wide DNA methylation dysregulation; however, some features may be shared by these tumors as they have similar characteristics across different patients. Thus, we explored methylomes of primary tissues of osteosarcomas and the possible link with somatic genetic variants, aiming to provide insights into the mechanisms underlying its tumorigenesis. We highlighted genes, biological processes and signaling pathways, including metabolic changes in the blood, which might be involved with osteosarcoma development.

Characteristics of the patients
Our cohort consisted of patients diagnosed with osteosarcoma between 2008 and 2014, with fresh-frozen samples stored at the biobank from the Barretos Cancer Hospital (SP, Brazil). Samples from 28 osteosarcomas (14 females and 14 males) were obtained before any systemic treatment (chemotherapy or radiotherapy), as well as nine non-paired normal bone tissues from surgeries of non-cancer patients. Pediatric patients were aged between 10 and 20 years (n = 23) and five patients were adults (29-61 years) ( Table 1). The normal bone samples came from patients that were diagnosed with cancers other than osteosarcoma, and they were assessed for non-tumoral or inflammatory cells by the pathologist (five males, four females). The treatment of patients followed the clinical protocol of the Brazilian Osteosarcoma Treatment Group (Uzan et al. 2016). The conduction of this study followed national and institutional ethical policies with approval from the Barretos Cancer Hospital Ethical Committee (CEP-HCB 898.403). Informed consent was obtained from the patients and/or their legal guardians.

Samples preparation and quality control.
Tumor (OS) and normal (BT) samples were microdissected and reviewed by a pathologist, who selected only areas containing at least 70% of cells of interest for the study. DNA was isolated using the QIAsymphony equipment with the DSP DNA Mini Kit (Qiagen). Quantity and quality were assessed by NanoDrop (Thermo Fisher Scientific) and agarose gel 0.8%, respectively. DNA was bisulfite converted using the EZ DNA Methylation kit (Zymo Research Corp), according to the manufacturer's recommendations, and hybridized in the Human Methylation 450 BeadChip microarrays (HM450K, Illumina), following the Illumina Infinium HD methylation protocol. Raw data were extracted using the iScan SQ scanner (Illumina) by GenomeStudio software (v.2011.1), with the methylation module v.1.9.0 (Illumina), into IDAT files, which were used for further analyses.
Methylation data are available in NCBI Gene Expression Omnibus (GEO) under accession number GSE193000.

Preprocessing and identification of differentially methylated genes
The R package ChAMP (Morris et al. 2014;Tian et al. 2017) was applied to the dataset and all 37 samples passed QC parameters. We used the standard pipeline: filtered out probes included those with a bead count < 3 in at least 5% of samples (n = 786 probes), located in non-CpG sites (n = 3,047 probes) or in SNPs (n = 49 977 probes), as well as probes that aligned to multiple locations (n = 7 143) or located in the XY chromosomes (n = 10 046). A total of 410 617 probes were retained for further analyses. Signal intensities from probes type I and II were normalized using the updated version of BMIQ method (Teschendorff et al. 2013).
The singular value decomposition method (SVD) assessed unwanted components of variation, which was corrected using ComBat (Johnson et al. 2007) (Supplemental Figure S1). To avoid the effects of cell type composition, the R package RefFreeEWAS (Houseman et al. 2014) was applied to β values, as implemented in ChAMP, resulting in the identification of differentially methylated CpG positions (DMPs) between OS and BT samples. DMRcate (Peters et al. 2015) was applied to identify differentially methylated regions (DMRs), defined as at least seven consecutive CpG sites presenting methylation changes in the same direction (hypo-or hypermethylation) in a DNA sequence of 300 nucleotides. DMRs with FDR < 0.05 and methylation differences > 20% were considered significant.
To evaluate the global methylation heterogeneity within the tumor samples, we calculated the coefficient of variation (CV), through the ratio between standard deviation/mean of 1 3 DNA methylation levels across the genome (Sheffield et al. 2017).

Enrichment analysis using the sets of differentially methylated genes
Functional enrichment analyses were applied to genes mapped within the DMRs as input data and the wholegenome as background using Gene Set Enrichment Analysis (GSEA, (Subramanian et al. 2005)). We decided to evaluate the set of genes mapped to DMRs as the possible impact has been more widely explored in the literature. Genes were clustered based on biological processes and molecular functions (Gene Ontology), pathways (Reactome), transcription factor targets, and chemical and genetic perturbations. Features with adj P ≤ 0.05 (Benjamini-Hochberg adjustment) that comprised at least ten genes were considered statistically significant.
Analysis of copy-number alterations (CNA) was performed using the Nexus Copy Number 9.0 software (Bio-Discovery, California, USA), with healthy control samples as references. Identification of CNAs was based on a sample/ reference ratio threshold log 2 ≥ 0.25 for gains, ≤ − 0.25 for losses, ≥ 1.2 for high copy gains and ≤ − 1.2 for homozygous copy losses. Common germline CNAs based on DGV data (http:// dgv. tcag. ca/ dgv/ app/ home) were disregarded. All the variants called by the software algorithm were visually inspected for validation.

Gene expression analysis by qRT-PCR
The expression levels of selected genes of interest were evaluated by quantitative real-time PCR (qRT-PCR) in cDNA derived from eight primary osteosarcoma samples and three normal bone samples, using the Fast SYBR Green technology (Applied Biosystems) with the 2-ΔΔCq method (Pfaffl 2001). Cycling conditions were: 10 min at 95 °C, followed by 40 cycles of 15 s at 95 °C and 1 min at 60 °C, and a final stage of 15 s at 95 °C, followed by 1 min at 60 °C and 15 s at 95 °C. Data were normalized to the expression of the housekeeping gene GAPDH. All reactions were performed in duplicates in a total volume of 20 mL. The primers presented an efficiency close to 100% (± 10%).

Osteosarcomas present high inter-sample heterogeneity of DNA methylation and a global hypomethylation pattern
Differences between OS and BT samples were characterized using the methylation levels of 410 617 CpG sites. Multidimensional scaling (MDS) using the top 1% most variable CpG sites showed that BT samples grouped tightly together, while the OS group had high inter-sample variability ( Fig. 1A; Supplemental Figure S2). OS samples presented a high level of heterogeneity, with a median CV of 0.77, compared to a median CV of 0.73 obtained in BT controls (t test; p value = 0.015). This heterogeneity is unlikely to be related to morphologic subtypes, tumor content, or purity, as data were corrected for such variables. Methylation levels of OS and BT were compared in a linear Bayesian framework model, revealing 3 146 differentially methylated positions (DMPs; q value < 0.01 with methylation differences (DB) > 10%; Supplemental Table S1). Unsupervised hierarchical clustering based on these 3 146 DMPs (Pearson correlation with complete linkage) discriminated BT controls from 27 out of 28 OS samples (Supplemental Figure  S3). The clustering did not show significant association with clinical parameters, such as histology, Huvos grade, or metastasis at diagnosis (Fig. 1B).
Considering the genomic location, OS displayed a global hypomethylation profile (Fig. 1C) and a non-random DMPs distribution, with most hypermethylated CpG sites located at CpG islands, whereas hypomethylated CpG sites were mapped in open-sea regions, with no changes in shores and shelves (p < 0.0001, Chi-square distribution test, Fig. 1D; Supplemental Table S1). The methylation status of CpG islands was not related to clinical features, suggesting a mechanism inherent to all OS samples rather than specific to OS characteristics or subtypes.
To validate our data, we interrogated the 3146 DMPs in an available independent set (GSE125645) of 28 osteosarcomas and two bone tissue samples. Unsupervised hierarchical clustering fully discriminated tumor from normal samples, validating our findings (Supplemental Figure  S4).

DMR genes are involved with bone development and cancer hallmarks
From the 3146 DMPs, 2018 CpG sites were located in or next to 1489 genes. Since DNA methylation changes can affect multiple neighboring CpG sites, we searched for differentially methylated regions (DMRs) and identified 319 hypomethylated and 266 hypermethylated DMRs (FDR < 0.0001, mean methylation differences > 20%) mapped to the promoter regions of 172 and 178 genes, respectively (Supplemental Table S2).
The DMR genes were enriched (adjP < 0.05) for biological processes related to cell differentiation, morphogenesis, and organ development, including skeleton system  Table 2). The enrichment analysis in Reactome database also pointed to signaling pathways related to development and suggested the involvement of transcriptional factors usually associated with the early stages of development, such as RUNX2, POU5F1, SOX2, and NANOG (Table 3), although they were not among the enriched transcriptional factors in the TRANSFAC database (Supplemental Table S3). Because these genes seem to be related to developmental processes, the 350 DMR genes were also searched in the Chemical and Genetic Perturbations (CGP) database. Again, several categories pointed to the involvement of mechanisms usually associated with development, including 55 genes that harbor the trimethylated H3K27 (H3K27me3) mark in their promoters in human embryonic stem cells, brains, and liver, or have been identified by ChIP on chip as targets of the Polycomb proteins EED and SUZ12. Moreover, eight genes have been described to be de novo methylated in cancer (ALX3,   Table S4). From 350 genes mapped to the DMRs, 163 were represented in the expression platform from Kuijjer et al. (2012), from which 64 were considered differentially expressed genes (p < 0.05). Unsupervised clustering based on their expression discriminated normal osteoblasts from all osteosarcoma samples (Supplemental Figure S5). Similar to the methylation data, the 64 differentially expressed genes were enriched for biological processes related to cell differentiation, morphogenesis and organ development, cell motility, RNA-related processes, and cell proliferation among others.
Since the DMR genes pointed to the enrichment of metabolism-related biological processes, we also took a closer look at the levels of metabolic compounds present in the peripheral blood of the same patients of this study (analysis reported in Quintero Escobar et al. (2020)). The investigation revealed high levels of histidine and glutamine, which are 2-OG precursors, then valine as a succinyl-CoA precursor, and phenylalanine and tyrosine (Fig. 2) that can be metabolized to fumarate.

Occurrence of CNAs at DNA methylation enzymes
Following, we searched for genomic alterations in enzymes that mediate DNA methylation that might have contributed to the observed epigenetic profile of tumors. One sample harbored a likely benign missense mutation in DNMT1, while there were several CNAs in both families of DNA methylases and demethylases (Fig. 3), mostly gain events affecting DNMTs and losses in TET genes. DNMT3B stood out as the gene with the highest number of gains (46% of the samples), and TET1 with the highest number of losses (39%).
Other epigenetic modifiers were also affected by alterations, mostly losses: the chromatin remodeler ARID1B (two missense variants, three gains, ten losses); the histone methyltransferase SETD2 (one LoF variant, one missense, one gain, and 12 losses), and the methyl-CpG-binding domain protein MBD1 (one LoF variant, three gains, and six losses) (Supplemental Table S5; Supplemental Table S6).
Expression levels of DNMT3B were tested in a subgroup of samples, reporting a substantial increase (fivefold, p value = 0.03, t test) in osteosarcoma (n = 8) compared to normal bone tissues (n = 3) (Fig. 4).

Discussion
Osteosarcomas are highly complex and heterogeneous tumors. A large number of genes can be mutated though few are recurrent. Rather, these tumors have aneuploidy, chromothripsis, and uncontrolled cell cycle, suggesting defects in DNA repair mechanisms and/or epigenetic instability (de Azevedo et al. 2020). Nevertheless, the high diversity of genomic alterations may share similar functional modules related to events, such as DNA damage, response to stress, epigenetic processes, mitosis, and cellular motility (Poos et al. 2015). The reversibility of epimutations turns them into interesting targets for the development of therapies. In osteosarcomas cell lines, the demethylating agent decitabine (5-aza-dC, a DNMT1-inhibitor) can promote the recovery of gene expression silenced by hypermethylation, resulting in reduced cell growth (Asano et al. 2019) or increased sensibility to chemotherapeutic agents at concentrations within therapeutic levels (Chaiyawat et al. 2020). Therefore, DNA methylation profiles may directly point to robust disrupted mechanisms in osteosarcomas.
The genome-wide methylation characterization of 28 osteosarcomas showed a significant epigenetic heterogeneity, as shown by a coefficient of variation (CV = 0.7), much higher than other solid tumors (~ 0.08 in Ewing sarcoma; ~ 0.03 in glioblastoma) (Agirre et al. 2015;Sheffield et al. 2017). It is an expected finding as these tumors usually present a high mutational and chromosomal alteration burden (Rickel et al. 2017), particularly compared to tumors with few genetic alterations such as Ewing sarcoma (Sheffield et al. 2017). It could indicate that the high DNA methylation heterogeneity is an intrinsic event of osteosarcoma tumorigenesis or that is related to the high number of cell types found in these tissues. This second hypothesis is supported by the high CV in control bone samples (0.73), which raises the possibility that DNA methylation is intrinsic to the bone/cell composition and not related to the tumorigenesis itself. We had one exception among our samples that grouped with the bone samples in the unsupervised clustering; this sample also had few SNVs and no detectable CNAs. At a closer look at the clinical and pathological parameters, this case was not different from the others, suggesting that this biological behavior may be explained by the intratumor heterogeneity, with the sample used for the experiments being epi-and genetically not representative of the tumor as a whole.
Using our set of differentially methylated CpGs in an independent group of samples has discriminated osteosarcomas from non-tumoral bone samples, therefore validating our analysis. Considering the DMRs genes, we applied a similar analysis based on the expression profiles which also separated osteoblast from osteosarcoma samples. Therefore, we consider that, although we used a small number of samples, these results indicated that our analysis was robust. Osteosarcomas displayed a global hypomethylation with a pattern of CpG islands' hypermethylation, similar to other cancers (Timp et al. 2014;Vidal et al. 2017). Global hypomethylation is usually related to the activation of oncogenes and increase of genomic instability, while hypermethylation of CpG islands usually results in the inactivation of tumor suppressors and other regulatory elements (Pfeifer 2018). Consistently, among the genes contained within hypermethylated DMRs, we identified 17 tumor suppressors, from which six were also affected by alternative inactivation mechanisms, such as deletions (DLEC1, GJB2, HIC1, MIR149, PAX6, and  WNT5A). Silencing by aberrant promoter methylation of cell proliferation regulators and genomic stability maintainers, such as DLEC1 and HIC1, is a recurrent event in a variety of cancers (Seng et al. 2008;Ying et al. 2009;Zheng et al. 2013;Özdemir et al. 2018). We also noted that seven tumor suppressor genes and nine oncogenes presented expression and methylation patterns contrary to what we would expect. These genes may not have a role in osteosarcomagenesis and their alterations can be only passenger mutations, as a consequence of the genomic instability. Another plausible hypothesis is that they may play a role not yet recognized in osteosarcoma. Three of these genes are microRNAs (MIR149, MIR206, and MIR494), that are known to have a widespread regulatory activity, usually dependent on the context of the cells (Krol et al. 2010).
We hypothesize that the preferential hypermethylation pattern at CpG islands has a connection with DNMT3B gains in osteosarcomas and genetic perturbations related to H3K27me3 and PRC2. DNMT1 and DNMT3A also had extra copies in some samples. DNMT1 overexpression is a recurrent event across osteosarcomas, with a negative impact on the expression of several tumor suppressor genes (Chaiyawat et al. 2020). DNMT3A and DNMT3B seem to have both redundant and exclusive targets; while DNMT3A remains active in adults , expression of DNMT3B catalytic isoforms decreases during cell differentiation (Gifford et al. 2013).
DNMT3B-induced methylation coincides with H3K27me3 and results in methylation of the surrounding CpG island DNA, affecting PRC2 recruitment, a mechanism involved in the tumorigenesis of gliomas (Harutyunyan et al. 2019). The involvement of H3K27me3 in osteosarcomas is strengthened by the fact that 87% of the tumors from young patients present imprinting defects at the 14q32-locus, an enriched region for both the active marker H3K4me3 and the silence marker H3K27me3 (Shu et al. 2016).
An enrichment of hypomethylated DMRs was reported at 6p22, where several histone-coding genes are located. Samples exhibited both gains and losses events encompassing 6p22, reinforcing the genomic instability. Interestingly, only gains of 6p/6p22 were reported as a recurrent event in a variety of cancers, including osteosarcomas (Man et al. 2004;Lim et al. 2004), with impact in tumor progression, aggressiveness, and metastatic potential (Santos et al. 2007). In agreement with the hypomethylation found in 6p22, a cluster of over-expressed genes in 6p has been reported .
DMRs often overlapped with CNAs in our analysis. It is possible that at least part of the DMRs' hypomethylation was induced by the unbalance between copy-number gains and the reestablishment of original DNA methylation patterns. Both mutational processes are associated with high genomic instability (Pfeifer 2018;de Azevedo et al. 2020;Franceschini et al. 2020). Furthermore, several genes related to DNA damage response and repair, such as BAP1, BRCA1, BRCA2, PALB2, PARP1, PTEN, RB1, SETD2, and TP53, presented copy-number losses or loss-of-function mutations (but no differential methylation), although these pathways were not enriched in our analyses. TP53 is not a recurrent target for epigenetic regulation, but changes in the pathways or genes modulated by its activity have already been reported (de Azevedo et al. 2020). Thus, further studies are necessary to disclose the joint role of DNA methylation and chromosomal alterations in the tumorigenesis of osteosarcomas, mainly targeting the gene expression patterns associated with these changes.
There are limited data from published studies that evaluated DNA methylation in osteosarcoma and bone samples, either because the raw or normalized profiles were not available or because they originated from commercial cell lines. For instance, we identified three studies which evaluated methylation profiles of primary osteosarcomas from patients but not normal bone samples (Wu et al. 2017;Koelche et al. 2018;Lietz et al. 2022). The design from these studies prevented us from testing our data, because our list of CpG sites came from the comparison between tumor and normal samples.
Considering the commercial cell lines, methylation data from 19 OS cell lines revealed 328 genes with at least one differentially expressed CpG site located at the promoter region . From these, 62 genes overlapped with our list of DMR genes. However, considering the technical differences between the approaches (Illumina 27 K and 450 K Beadchip arrays), we decided to compare biological processes instead of individual genes. The enriched biological processes within the DMR genes suggested the involvement of mechanisms associated with all steps of cellular differentiation, similar to other studies Asano et al. 2019) and consistent with the period of onset of the disease (Morrow and Khanna 2015;Misaghi et al. 2018). Osteogenic lineages generated by the induction of differentiation from adipose-derived stem cells present a DNA methylation pattern similar to that of osteocytes, with the majority of CpG sites that change DNA methylation located at promoter-associated CpG islands (Berdasco et al. 2012). Therefore, dysregulation of DMR genes, as we described, might be associated with the interruption of the differentiation process. Accordingly, among the biological processes, skeletal development, represented by 28 genes (Table 2), was enriched within the DMRs.
We reported alterations of TCA anaplerotic substrates and a very high glycolysis rate in the peripheral blood of these same patients (Quintero Escobar et al. 2020). Alterations in intermediates of the TCA cycle (histidine and glutamine, which are 2-OG precursors; valine as a succinyl-CoA precursor; and phenylalanine and tyrosine, that can be metabolized to fumarate) can interfere with TET1 activity (Fletcher and Coleman 2020), what ultimately could result in the global DNA hypomethylation found in osteosarcomas.
We emphasize that only frozen samples with more than 70% of the cells of interest were selected for this study, because the ratio of contamination of normal cells affects the data of Infinium HumanMethylation 450. This criterion resulted in the main limitation of this study, which was the lack of paired normal and tumor samples. This prevented us from safely determining the spectrum of somatic alterations, methylation, or mutation. To overcome these, we used normal bone tissues in the methylation analysis and applied restrictive data filtering criteria, excluding any variant present in population databases.
Overall, we reinforce that aberrant DNA methylation is spread throughout the genome of osteosarcomas and is involved in the dysregulation of biological processes related to skeletal development and cell differentiation. While DNA hypomethylation contributes to global genomic instability, CpG island hypermethylation enrichment suggests the involvement of a controlled mechanism likely related to the silencing of tumor suppressor genes. The fact that the differentially methylated CpGs and expression of genes located at the DMRs were able to discriminate osteosarcoma and bone samples in independent group of cases shows some consistency between independent cohorts and points to a common mechanism that regulates DNA methylation lost in the osteosarcoma samples. In addition, DNMT3B had a higher expression in osteosarcomas. Although we were not able to evaluate the expression of TET1 and the other DNA methylation-related enzymes, we believe that the results from the methylation data indicate an underlying mechanism that involves DNA methylation for osteosarcomagenesis. These findings suggest that DNMT3B may be a promising gene for further evaluation in the context of methylome variation in osteosarcomas.

Acknowledgements
The authors would like to thank the patients and their families for the collaboration. This work was supported by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior and Fundação de Amparo à Pesquisa do Estado de São Paulo.
Data availability Methylation data are available in NCBI Gene Expression Omnibus (GEO) under accession number GSE193000.

Conflict of interest None declared.
Ethics approval and consent to participate This study was conducted following national and institutional ethical policies with approval from the Barretos Cancer Hospital Ethical Committee (CEP-HCB 898.403). Informed consents were obtained from the patients or their legal guardians.