Targeting CDA-Directed Metabolism With 5-Formyl-2'-Deoxycytidine For ALK Inhibitor–Resistant Lung Cancer Therapy

Background: Acquired resistance to inhibitors of anaplastic lymphoma kinase (ALK) is a major clinical challenge for ALK fusion–positive non-small-cell lung cancer (NSCLC). In the absence of secondary ALK mutations, epigenetic reprogramming is one of the main mechanisms of drug resistance as it leads to phenotype switching that occurs during the epithelial-to-mesenchymal transition (EMT). While drug-induced epigenetic reprogramming is believed to alter the sensitivity of cancer cells to anticancer treatments, there is still much to learn about overcoming drug resistance. Methods: We used an in vitro model of ceritinib-resistant NSCLC and employed genome-wide DNA methylation analysis in combination with single-cell (sc) RNA-seq to identify cytidine deaminase (CDA), a pyrimidine salvage pathway enzyme, as a candidate drug target. Molecular biology was used to characterize the role of CDA in drug resistance. Integrated analysis of scRNA-seq and scATAC-seq identiﬁed gene regulatory networks in resistant cells. Clinical relevance of CDA was evaluated using TCGA datasets, patient-derived cells, and tumor biopsies. Results: CDA was hypomethylated and upregulated in ceritinib-resistant cells. CDA-overexpressing cells were rarely but deﬁnitively detected in the na¨ıve cell population by scRNA-seq, and their abundance increased in the acquired-resistance population. Knockdown of CDA had antiproliferative e ﬀ ects on resistant cells and reversed the EMT phenotype. Treatment with epigenome-related nucleosides such as 5-formyl-2’-deoxycytidine selectively ablated CDA-overexpressing resistant cells via accumulation of DNA damage. Conclusions: Targeting CDA metabolism using epigenome-related nucleosides represents a potential new therapeutic strategy for overcoming ALK-inhibitor resistance in NSCLC.


Background
Anaplastic lymphoma kinase (ALK) is a receptor tyrosine kinase that is expressed in the nervous system, testes, and small intestine in adult humans [1].Chromosomal rearrangements of ALK result in fusions with more than 20 different genes that have been documented thus far, and ALK fusion proteins drive tumorigenesis in many different cancers [2].A fusion between EML4 (echinoderm microtubule-associated protein-like 4) and ALK was identified in non-small-cell lung cancer (NSCLC) in 2007 [3].Since then, ALK fusions have been detected in 3-7% of patients with NSCLC and have been associated with non-smokers and younger age [4].
ALK inhibitors are highly effective at treating patients with ALK fusion-positive NSCLC, but the inevitable emergence of chemotherapeutic drug resistance limits their utility [2].Genetic mechanisms such as secondary mutations in the ALK kinase domain explain the drug resistance observed in 20% and 50% of patients with firstgeneration (crizotinib) and second-generation (ceritinib and alectinib) treatment, respectively [5].This suggests that non-genetic mechanisms may contribute to drug resistance by modulating transcriptome plasticity [6].It was recently suggested that drug-induced epigenetic reprogramming can alter the sensitivity of cancer cells to anticancer treatments [7].Indeed, epigenome characteristics such as DNA methylation, histone modifications, and non-coding RNAs contribute to gene expression changes for adaptation in response to anticancer therapies [8], but there is still much to learn about how epigenetic mechanisms contribute to drug resistance.
Acquisition of nucleotides is critical for proliferating cells to replicate DNA.Nucleotides are produced either by de novo synthesis or by salvage pathways that recycle nucleobases from intracellular degradation processes or are acquired via extracellular uptake [9].Cytidine deaminase (CDA) is an enzyme of the pyrimidine salvage pathway that catalyzes the deamination of cytidine and deoxycytidine to produce uridine and deoxyuridine, respectively [10].Epigenome-related nucleosides such as 5-hydroxymethyl-2'-deoxycytidine (5hmdC) and 5-formyl-2'-deoxycytidine (5fdC) are produced by ten-eleven translocation (TET) methylcytosine dioxygenases [11].Most cells can scavenge epigenome-related nucleosides and maintain genomic integrity, but cancer cells overexpressing CDA convert 5hmdC and 5fdC into variants of uridine, which accumulate in the DNA and result in DNA damage and potentially cell death [12].Therefore, epigenome-related nucleosides could present a new strategy for targeting tumors overexpressing CDA [13].
We previously reported that genomic enhancer remodeling and microRNA alterations can drive the epithelial-to-mesenchymal transition (EMT) and promote chemotherapeutic resistance to ALK inhibitors [14].In this study, we examined genome-wide changes in DNA methylation in acquired resistance to the anticancer drug ceritinib and identified CDA among the hypomethylated and upregulated genes in drug-resistant cells.Single-cell RNA sequencing scRNA-seq revealed rare CDAoverexpressing cells in the naïve cell population (those without acquired resistance) and that CDA-overexpressing cells with acquired resistance thrive when exposed to ceritinib.Finally, we found that treatment with epigenome-related nucleosides such as 5fdC might be a promising therapeutic strategy for overcoming ALK-inhibitor resistance in CDA-overexpressing NSCLC.
Bulk RNA-seq analysis Bulk RNA-seq was performed as described [14].The RNA-seq library was prepared using the TruSeq RNA Sample Prep kit (Illumina), and sequencing was performed using the Illumina HiSeq2000 platform to generate 100-bp paired-end reads.Sequenced reads were mapped to the human genome (hg19) using STAR (v.2.5.1) [15], and gene expression was quantified using the count module in STAR.The edgeR package (v.3.12.1) [16] was used to select differentially expressed genes (DEGs) from the RNA-seq count data.The 'trimmed mean of M-values' normalized value for each gene (in counts per million, cpm) was set at 1 and Log 2 -transformed for further analysis.
Quantitative reverse transcription-PCR (qRT-PCR) RNA was isolated using the RNeasy Plus Mini kit (Qiagen, 74136) and measured with a NanoDrop ND-1000 spectrophotometer (Agilent).cDNA was synthesized from 1 µg RNA using the iScript cDNA Synthesis kit (Bio-Rad, 1708890).Real-time PCR was conducted in triplicate for each sample according to the manufacturer's instructions (Bio-Rad, 170-8880AP).The value for each gene was normalized to the β-actin signal.Supplementary Table 1 lists the sequences of primers.

DNA methylation microarray analysis
Genome-wide DNA methylation was analyzed in duplicate using the Infinium methylation 450K bead chip array (Illumina).The resulting DNA methylated/ unmethylated signal intensity data were imported into R (v.3.4.2) for analysis.The data were normalized using the subset-quantile within array normalization method with background correction using the minfi (v.1.30.0)R package.CpG methylation values were calculated as average β values.Measurements with P < 0.05 were considered to be significant above background.To identify differentially methylated CpGs between H3122 and LR cells, statistical analysis was performed using the DMRcate 4 R package (v.1.20.0).Differentially methylated CpGs with P < 0.05 and an average difference > 15% were selected.

Gene-set and pathway analysis
Gene-set and pathway analysis was performed on DEGs associated with differentially methylated regions using DAVID (v.6.8) and enrichR.A cut-off threshold of P < 0.05 was used to obtain significantly enriched pathways.Gene sets enriched with at least three genes were considered for further analysis.
Preparation and sequencing of scRNA-seq library Samples were prepared as outlined in the 10× Genomics Single Cell 3' v2 Reagent kit user guide.The single-cell RNA-seq (scRNA-seq) library was prepared using the Chromium Single Cell 3' Library and Gel Bead kit V2 (PN-120237), Chromium Single Cell 3' Chip kit V2 (PN-120236), and Chromium i7 Multiplex kit (PN-120262) with the 10× Genomics Chromium instrument.Samples were sequenced on a HiSeq 2500 with the following run parameters: read 1, 26 cycles; read 2, 98 cycles; index 1, 8 cycles.A median sequencing depth of 60,000 reads/cell was targeted for each sample.Supplementary Table 2 lists the 10× Genomics web summaries for each sample profiled.
scRNA-seq data processing and analysis Data preprocessing was performed following the pipeline provided by 10× Genomics.The process of converting raw sequencing data (Base Call files) into FASTQ files and aligning to the human reference genome (hg19) was performed using 10× Genomics cellranger software.The Cell Ranger count module was used to generate the raw gene expression matrix, and the Cell Ranger aggr module was used to aggregate the H3122 and LR data.The rest of the process was performed using R (https://www.R-project.org/ ) and a guided analysis pipeline from Seurat 5, v.2.4 [17].
Cell and gene filtering processes included only genes expressed in at least three cells and cells expressing at least 200 genes.Cells with a mitochondrial genome transcript ratio greater than 50% were considered lysed and were excluded.For data normalization, the global-scaling normalization method LogNormalize (Seurat 5, v.2.4) was used with default parameters.The selection of variable genes, scaling of data, and linear dimensional reduction were performed sequentially following the Seurat-guided analysis pipeline.To determine the dimensionality of the dataset, we analyzed the elbow plot and determined 10 principle components.Cells were clustered using the Seurat FindClusters module, and clusters were visualized using uniform manifold approximation and projection (UMAP).To find DEGs in each cluster, we used the FindAllMarkers module (Seurat 5, v.2.4).

Bisulfite sequencing
Genomic DNA was isolated using the DNeasy Blood and Tissue kit (Qiagen, 69506).Unmethylated cytosines were converted into uracil by sodium bisulfite using the EZ DNA Methylation-Gold kit (Zymo Research, D5005).Modified DNA was amplified with primers targeting the CpG sites (cg04087271, cg20619374 and cg06984156) of CDA (Supplementary Table 3).Gel-purified bands were extracted using the Gel Extraction kit (Qiagen, 28706) and cloned into pGEM-T Easy vector (Promega, A1360).Multiple plasmid DNA was isolated using the HTS Plasmid kit (Core Bio System, PHTS-30).Subsequently, Sanger DNA sequencing was performed by GenoTech.
Cell proliferation assay using the IncuCyte system Cell proliferation rates were determined based on cell confluency by live-cell imaging using the IncuCyte ZOOM System (Essen Bioscience).H3122 or LR cells were seeded in 96-well plates (Corning, 3596), and photomicrographs were taken at 2h intervals from four separate regions per well using a 10× objective.Cultures were maintained in a 37°C incubator.Cell confluency was measured using IncuCyte software (Essen Bioscience).

Wound healing assay
To assess wound healing, 2×10 5 cells were seeded in a 35-mm culture insert µ-Dish (IBIDI, 81176) and incubated overnight at 37°C.When cells were attached, the insert well was removed and fresh culture medium was added.After 15 h, images of wounds were captured with a Nikon Eclipse Ti-S microscope with a 10× objective.The average width of wounds at different intervals was determined after cell migration, and the distance between wound edges was calculated.Wound closure area was measured using ImageJ software.

Cell migration and invasion assay
The membrane of Transwell inserts (Corning, 3421) was coated with 0.25 mg/ml fibronectin (Sigma-Aldrich, F2006) or 1 mg/ml Matrigel (Corning, 354248).Then, 5×10 4 cells were seeded on the insert and incubated for 48 h.Cells on the upper membrane that had not migrated or invaded were removed with cotton swabs.Migrating and invading cells were stained and fixed with 0.5% crystal violet, 3.7% formaldehyde, and 30% ethanol for 2 h.The central portion of the membrane was imaged on a Nikon Eclipse Ti-S microscope with an Intensilight C-HGFI illuminator and 10× objective.Migrating and invading cells were manually counted from four sites in each image.
Preparation and sequencing of scATAC-seq library Samples were prepared as outlined in the 10× Genomics Single Cell ATAC Reagent kit v1.1 user guide.A single-cell assay of transposase-accessible chromatin (scATAC) library was prepared using the Chromium Next GEM Single Cell ATAC Library and Gel Bead kit v1.1 (PN-1000175), Chromium Next GEM Chip H Single Cell kit v1.1 (PN-1000161), and Single Index kit (PN-1000212) with the 10× Genomics Chromium instrument.Samples were sequenced on a HiSeq 2500 with the following run parameters: read 1, 50 cycles; read 2, 50 cycles; index 1, 8 cycles; index 2, 16 cycles.A median sequencing depth of 25,000 reads/nucleus was targeted.The 10× Genomics web summary can be found in Supplementary Table 5.

scATAC-seq data processing and analysis
Cell ranger-atac (v1.2.0) demultiplex raw Base Call files were generated by Illumina sequencers and converted into FASTQ files using the cellranger mkfastq module.Chromium scATAC libraries include paired-end read 1 (containing the 50-bp insert) and read 2 (50-bp insert), the 10× barcode in the i5 index (16 bp) read, and the sample index in the i7 index (8 bp) read.FASTQ files were aligned to the human reference genome (hg19) using the cellranger-atac count module, and the web summary, peak count matrix, and transcription factor (TF) count matrix were generated.Quality control was conducted using fraction reads in cells, the mapping quality, insert sizes, and the transcription start site (TSS) profile in the web summary following the 10× Genomics recommendation.The number of reads per cell exceeded 50,000 to saturate coverage.The peak count matrix and bam files were used as input for downstream analyses identifying cell clusters and differentially accessible regions (DARs).

Data integration, dimensionality reduction, clustering, and DAR analysis in scATAC-seq
The scATAC-seq datasets of LR cells produced 7,753 filtered nucleus barcodes.Overall scATAC-seq analysis was performed using the SnapATAC package (v.1.0.0) [18], which consists of snaptools for preprocessing and SnapATAC for clustering, annotation, motif discovery, and downstream analysis.First, a snap (single-nucleus accessibility profile) file was generated from bam files from cellranger-atac using snaptools (https://github.com/r3fang/SnapATAC/wiki/FAQs).This step included the addition of the cell-by-bin matrix to the snap file.The bin size was set to 5 kb following the recommendation of snaptools.Second, the snap file was loaded with snaptools, and our scATAC-seq dataset was integrated for batch effect correction using the harmony module in SnapATAC.Third, for high-quality barcodes, filtering was performed using the default parameters in SnapATAC, except that the mitochondrial genome was not included.The rest of the process was carried out according to the SnapATAC pipeline.For integration of the scRNA-seq and scATAC-seq datasets, a cell-by-gene matrix (gene activity score) was added based on the gene body region after conversion to a Seurat object using the snapToSeurat module in SnapATAC.Subsequent processing was based on the scATAC-seq and scRNA-seq integration section of Seurat.Using the TransferData module in Seurat, pseudo-multiomics were created for each cell of scATAC-seq.Cells with a prediction score less than 0.3 were discarded.Significantly enriched peaks for each cluster were distinguished with the runMACS module [19] in SnapATAC, and a narrow peak and bedgraph were generated for each cluster.DARs were analyzed using the findDAR module with the cutoff set to a P value of less than 0.05 and a Log 2 (fold change) of greater than 0. For the interaction score, the predictGenePeakPair module was used and the association between open chromatin regions (OCRs), and gene expression within ±250 kb centered on the TSS was calculated as Log 10 (P value).Motif analysis for DARs was performed using HOMER (v4.11.1) [20] by employing the runHomer module in SnapATAC.Motifs were analyzed based on 392 TFs in HOMER.pyGenomeTracks (v.3.5.1) [21] was used to summarize scATAC-seq with a genome browser.

Flow cytometry-based apoptosis assay
Cells were seeded in 100-mm culture dishes at 1.5×10 6 per dish prior to experimental treatment.Apoptosis assays were performed using the FITC Annexin V Apoptosis Detection kit I (BD Biosciences, 556547).Cells were analyzed using a BD FACSCalibur flow cytometer and CellQuest Pro software (BD Biosciences).

Immunohistochemistry
Tissue sections for immunohistochemistry were cut at 4-µm thickness from paraffin blocks of patient samples.Sections were stained with anti-CDA (Abcam, ab222515) using the Benchmark XT autostainer (Ventana Medical Systems).Expression of CDA was determined using the tumor proportion score, which is the percentage of viable tumor cells showing cytoplasmic staining.All samples were independently reviewed by a pathologist (J.K.) in a blinded manner.

Statistical analysis
The number of biological replicates (n) is described in the figure legends.Sample size determination by statistics was not applied in this study.Statistical analysis was performed using R software (v.3.6.3).Data are presented as mean ± SEM or SD as indicated in the figure legends.The researchers involved in this study were not completely blinded for the animal experiments but were blinded for human data analyses.Statistical significance was evaluated between two groups using Student's t test or the Mann-Whitney U test, as appropriate.A P value of < 0.05 was considered significant.

Results
Changes in the DNA methylome are associated with acquired resistance to ceritinib To explore changes in the DNA methylome and transcriptome during development of acquired resistance to an ALK inhibitor in NSCLC, we performed 450K Bead-Chip DNA methylation analysis, RNA-seq, and scRNA-seq of the EML4 -ALK fusion-positive lung-cancer cell line H3122 and the established ceritinib-resistant cell line LR [14] (Fig. 1A-B).Among the genes detected with RNA-seq (n = 10,519), 13% (n = 1,403) were upregulated and 14% (n = 1,535) were downregulated in LR (fold change (FC) > 2, false-discovery rate (FDR) < 0.05; Fig. 1C).CYP4F11 (cytochrome P450 4F11), EDIL3 (EGF-like repeats and discoidin domains 3), and AXL (AXL receptor tyrosine kinase) were the most strongly upregulated genes, and DUSP6 (dual-specificity phosphatase 6) and GLB1L2 (galactosidase beta 1-like 2) were the most strongly downregulated genes.DNA methyltransferases and TETs were also downregulated in LR (Fig. 1D), suggesting that acquired resistance may involve changes in not only the transcriptome but also the DNA methylome.
To determine whether the observed transcriptional changes were associated with DNA methylation changes, we analyzed genome-wide methylation changes using the Infinium Human Methylation 450K array.A total of 23,426 CpGs were hypermethylated and 17,797 CpGs were hypomethylated in LR (|β value change| > 0.15, P < 0.05; Fig. 1E).Integrated analyses of DNA methylation and gene expression revealed that 1,002 genes, including XYLT1 (xylosyltransferase 1) and DUSP6,w e r e hypermethylated and downregulated, and 809 genes, including ANKRD2 (ankyrin repeat domain 2) and AXL, were hypomethylated and upregulated (Fig. 1F).The hypermethylated and downregulated genes were associated with apoptosis, cell proliferation, and MAPK (mitogen-activated protein kinase) signaling, whereas the hypomethylated and upregulated genes were associated with cell adhesion, cell migration, and Hippo signaling (Fig. 1G).These data suggested that transcriptional plasticity may drive stable epigenetic changes such as DNA methylation during development of ceritinib resistance.
To examine differentially methylated and expressed genes more closely, we first focused on DUSP6.It has been reported that reactivation of MAPK signaling is a hallmark of acquired resistance to ALK inhibitors in NSCLC [22] and that decreased expression of DUSP6, the MAPK phosphatase, promotes resistance to ALK inhibitors.Interestingly, CpG sites in exon 3 of DUSP6, but not the promoter region, were heavily methylated in LR (Fig. 1H), suggesting that these sites may be critical regulatory regions for DUSP6 transcription.Other negative regulators of MAPK signaling, including SPRY1, DAB2IP, ARRB1, DMD, CNKSR3, PTPRR, NLRP12, WNK2, SLC9A3R1, ERRFI1, and SPRY4, were also hypermethylated and downregulated in LR (Supplementary Fig. 1).AXL is a receptor tyrosine kinase associated with tumor cell proliferation, metastasis, EMT, and drug resistance [23].We previously showed that AXL activation during EMT is a primary feature of acquired resistance to ALK inhibitors [14,24].The AXL promoter region was demethylated in cells with acquired resistance (Fig. 1I).These data suggested that drug-induced changes in DNA methylation have a key role in converting a transient transcriptional state to a stably resistant state.
scRNA-seq reveals CDA-overexpressing cells in both resistant and non-resistant cells When attempting to identify tumor cell heterogeneity or driver cell populations, the traditional bulk RNA-seq method has limitations because it analyzes the gene expression profile of a mixture of cells.Recently, scRNA-seq technologies have allowed investigation of RNA expression differences on a cell-by-cell basis [25].To explore the heterogeneity of naïve and drug-resistant cells, we performed scRNA-seq using the 10× Genomics Single Cell 3' Solution and obtained gene expression profiles for 9,401 cells.Clustering analysis of scRNA-seq data revealed that H3122 cells (n = 4,835) and LR cells (n = 4,566) were divided into 12 clusters based on uniform manifold approximation and projection (Fig. 2A).Clusters 10 and 5 consisted mostly of H3122 cells, whereas clusters 2, 7, 8, and 11 had mostly LR cells.Most LR cells were in the G2M or S phase of the cell cycle (Fig. 2B, Supplementary Fig. 2A), whereas most H3122 cells were in G1, suggesting that LR cells were more proliferative than H3122 cells.Differential expression analysis with bulk RNA-seq data allows comparison of a limited number of biological replicates, but scRNA-seq can identify key players in subpopulations of cells [26].We identified 761 upregulated genes and 401 downregulated genes in LR cells (Log 2 FC > 1.5, FDR < 0.001; Fig. 2C, Supplementary Table 6).EMT, cell cycle, and drug metabolism pathways were enriched in LR cells (Supplementary Fig. 2B), whereas response to endoplasmic reticulum (ER) stress and cell adhesion molecules were enriched in H3122 cells (Supplementary Fig. 2C).As expected, AXL was upregulated and DUSP6 was downregulated in LR cells (Fig. 2C).Intriguingly, although CDA-overexpressing cells were detected mainly among LR cells, they were also found rarely (yet definitively) in the naïve H3122 cell population (Fig. 2D-E).CDA is a nucleoside metabolism enzyme involved in homeostasis of the cellular pyrimidine pool [27].CDA-overexpressing cells in the naïve cell population showed enriched expression of cancer stem cell-related genes such as S100A10 (S100 calcium-binding protein A10) [28], LGALS1 (galectin-1) [29], and SH3BGRL3 (SH3 domain-binding glutamate rich protein-like 3) [30] (Supplementary Table 7).These data suggested that CDA-overexpressing cells may have a growth advantage during ceritinib treatment.CDA mRNA expression was increased 12-fold in LR cells, and methylation was reduced in promoter and putative enhancer regions of CDA (Fig. 2F).
To investigate whether CDA expression is increased in acquired resistance to other tyrosine kinase inhibitors (TKIs), we analyzed bulk RNA-seq data for crizotinibresistant H3122 cells [31].Consistent with our observations for ceritinib resistance, CDA and AXL were upregulated and DUSP6 was downregulated in crizotinibresistant cells (Supplementary Fig. 2D).Further, we analyzed scRNA-seq data for PC9 cells treated with erlotinib, the EGFR TKI, for 0, 1, 2, 4, 9, and 11 days [32].Astonishingly, CDA was upregulated even on day 1, whereas AXL was upregulated only after 9 days (Fig. 2G).On day 3 of erlotinib treatment, CDA expression was increased in the overall clusters, whereas AXL expression was decreased compared with day 0 (Fig. 2H).These data suggested that CDA-overexpressing cells may have a growth advantage during TKI treatment, causing them to be selected during treatment and thereby contributing to acquired resistance.

CDA depletion reverses EMT and reduces proliferation and migration of cells with acquired resistance
Western blotting confirmed that cellular CDA level was also increased in resistant cells (Fig. 3A).To elucidate the function of CDA in acquired resistance, we depleted CDA in LR cells using three different CDA siRNAs.All three siRNAs reduced the levels of both CDA mRNA and protein (Fig. 3B-C).Notably, CDA depletion reduced LR cell proliferation (Fig. 3D).In addition, treatment with tetrahydrouridine, a competitive inhibitor of CDA [33], had a dose-dependent inhibitory effect on proliferation of LR but not H3122 (Fig. 3E).Because CDA expression correlated with expression of EMT-related genes (Fig. 2I), we next examined whether CDA knockdown in LR could reverse the EMT phenotype.As expected, CDA depletion increased expression of the epithelial marker E-Cadherin and decreased expression of the mesenchymal markers N-Cadherin and Vimentin (Fig. 3F).Furthermore, CDA depletion reduced wound healing (Fig. 3G) as well as cell migration and invasion (Fig. 3H).
Next, RNA-seq was used to investigate transcriptome changes caused by CDA depletion in LR .CDA depletion led to downregulation of EMT-related genes such as SERPINB2, TGFBR1, and DRD5 as well as upregulation of embryonic development and morphogenesis genes such as GDNF, NOG, and HOXB9 (Fig. 3I, Supplementary Table 8).Five of the 25 previously identified CDA-linked genes (Fig. 2I) were downregulated by CDA silencing (CAV2, HIST1H2BK, TXNRD1, TRIM16L, and MSC ; Fig. 3I-J).CAV2 (Caveolin 2) mediates vesicular transport and signal transduction pathways.It modulates expression of EMT-related genes [34] and is associated with the cancer stem cell features of drug-resistant breastcancer cells [35].HIST1H2BK (histone cluster 1 H2B family member k, also known as H2BC12) regulates gene expression through chromatin remodeling.Overexpression of HIST1H2BK is associated with poor prognosis in low-grade glioma [36] and with cancer dormancy signatures and metastasis in breast cancer [37].TXNRD1 (thioredoxin reductase 1) is part of the thioredoxin system involved in cell proliferation, apoptosis, and metastasis.TXNRD1 is also involved in TGF-β-induced EMT in salivary adenoid cystic carcinoma [38].TRIM16L (tripartite motif containing 16 like, also known as TRIM70) is a member of the TRIM protein family, which is implicated in inflammation, immunity, and cancer.TRIM16L is upregulated in lung cancer and has a role in proliferation and migration [39].MSC (musculin, also known as ABF-1) is a helix-loop-helix TF with a key role in neoplastic changes of B cells in Hodgkin lymphoma [40] as well as TGF-β-induced regulatory T-cell development [41].Taken together, these data suggested that CDA has a key role in EMT and may improve cell survival by increasing both the proliferation and migration of cells with acquired resistance.

scATAC-seq reveals gene regulatory networks in cells with acquired resistance
The accessible chromatin landscape of the resistant LR cells was established using scATAC-seq, which yielded profiles for 7,753 nuclei, with a median of 6,821 fragments mapped per nucleus (Supplementary Table 5).To identify cis regulatory elements controlling gene expression in LR cells, we performed integrated analysis of scRNA-seq with scATAC-seq.Based on scRNA-seq data, scATAC-seq data for each cluster were predicted with a gene activity score using SnapATAC software (Fig. 4A) [18].To evaluate the relationship between OCRs and gene expression, interaction scores were calculated by comparing the expression level of each gene with the chromatin accessibility of each OCR located within ±250 kb of the TSS.OCRs were then categorized into four groups based on interaction score: high, mid, low, and unrelated (Fig. 4B).The high group consisted of 13.5% of all OCRs and included more promoter regions than any other group.Gene Ontology analysis revealed that OCRs of LR cells were proximal to genes involved in cellular responses to stress, regulation of cell death, and cell migration (Fig. 4C).To dissect causal TFs responsible for the transcriptome profile of LR cells, we performed TF motif analysis of OCRs using HOMER [20], revealing an enrichment of TF-binding signatures such as the ATF3, BATF, and NRF2 sequence motifs (Fig. 4D).
Looking specifically at CDA, we identified 10 OCRs (R1-R10) within ±250 kb of the TSS that were significantly associated with CDA expression (Supplementary Fig. 3A-B).OCRs R3-R7 were located in the CDA gene body, and R6 had the most significant correlation with CDA expression (Fig. 4E, Supplementary Fig. 3C), suggesting that it may be a strong enhancer of CDA.Supplementary Fig. 3D-E lists the top TFs predicted to bind the 10 OCRs.Among the CDA-associated TFs (Supplementary Table 9), we focused on FOXM1, TEAD1, and SMAD3.FOXM1, a proliferation-specific TF, mediates EMT-associated EGFR TKI resistance [42]; it promotes the rapid cancer-cell proliferation of small-cell lung cancer and is associated with poor prognosis [43].SMAD3 mediates transcriptional activation of EMT target genes in the TGFβ signaling pathway [44].TEAD1 is a key TF in various oncogenic signaling pathways, including the Hippo, Wnt, TGFβ, and EGFR pathways, and plays critical roles in EMT, metastasis, drug resistance, and cancer stem cells [45].To determine the effect of these TFs on expression of CDA,w ed e p l e t e d the TFs using siRNAs.Depletion of any one TF reduced CDA expression by 20%, and depletion of a combination of two or three TFs had synergistic effects (Fig. 4F).
Taken together, these data suggested that open chromatin structure may be formed in the promoter and enhancer of CDA at least in part by DNA demethylation in cells with acquired resistance.Furthermore, TFs such as TEAD1, SMAD3, and FOXM1 may be recruited to the regulatory region and induce overexpression of CDA, which promotes acquired resistance to ALK inhibitors (Fig. 4G).
5-Formyl-2'-deoxycytidine (5fdC) selectively ablates CDA-overexpressing cells CDA converts 5hmdC and 5fdC into 5hmdU and 5fdU, respective, both of which induce cytotoxicity when incorporated into DNA.Therefore, cytidine variants such as 5hmdC and 5fdC have been suggested as drug treatments for CDA-overexpressing cancers [12].We examined whether 5hmdC or 5fdC could inhibit LR cell proliferation.As expected, treatment of LR cells with 5hmdC or 5fdC decreased proliferation in a dose-dependent manner.Notably, high doses of 5fdC (10 µM) specifically decreased the proliferation of LR cells but not H3122 cells, suggesting that it selectively inhibits proliferation of CDA-overexpressing resistant cells (Fig. 5A).We then examined the effect of 5fdC on survival of LR cells and found that 5fdC attenuated their colony-forming ability (Fig. 5B) and increased apoptosis (Fig. 5C).To determine whether 5fdC would cause DNA damage specifically in LR cells, we carried out immunofluorescence staining for γ-H2AX, a marker of double-stranded DNA breaks.LR cells treated with 10 µM 5fdC showed a 2.5-fold increase in the number of cells with DNA damage (Fig. 5D).To investigate the effect of 5fdC on tumor growth, a xenograft assay was carried out using LR cells (Fig. 5E).Beginning on day 13 after the injection of LR cells, 5fdC (100 mg/kg) was administered each day by intraperitoneal injection.Compared with control mice, 5fdC-treated mice did not show significant changes in body weight but did exhibit reductions in tumor weight (Fig. 5E).Furthermore, tumors in 5fdC-injected mice showed a decrease in the number of proliferating cells and an increase in the number of cells with DNA damage (Fig. 5F).These results demonstrated that the CDA-overexpressing resistant cells were vulnerable to 5fdC owing to accumulation of DNA damage and that 5fdC could provide a potential strategy for overcoming ceritinib resistance.

Clinical relevance of CDA in lung-cancer patients
Based on our findings acquired with ceritinib-resistant cells, we searched the literature to determine whether upregulation of CDA is associated with resistance to other anticancer drugs (Supplementary Table 10).CDA expression was found to be increased 2.9-fold in MDA-MB-231 breast-cancer cells that are resistant to the CDK 4/6 inhibitor palbociclib [46] and 2.2-fold in HCT116 colorectal-cancer cells that are resistant to the MEK inhibitor trametinib [47].Ovarian-cancer cells had increased CDA during development of resistance to either the PARP inhibitor olaparib [48] or cisplatin [49].The EGFR-mutant PC9 lung-cancer cell line showed increased CDA expression during exposure to gefitinib, an EGFR TKI [50].These data suggested that CDA upregulation is a frequent event in chemoresistance.
To explore the clinical relevance of CDA in lung cancer, we analyzed CDA expression in lung adenocarcinomas using GEPIA [51], a tool for analyzing the Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx) databases.Although the median CDA expression was similar between tumor samples (n = 483, obtained from TCGA) and normal tissue samples (n = 347, obtained from GTEx), ˜25% of the tumor samples overexpressed CDA compared with normal samples (Fig. 6A).In addition, the overall survival of patients with lung adenocarcinoma with high CDA expression was significantly lower than that of patients with low CDA expression (Fig. 6B).
To determine whether CDA expression is associated with the EMT signature in lung adenocarcinoma, we investigated EMT scores in low and high CDA expression groups by surveying a 16-gene signature of canonical EMT from TCGA datasets [52].Notably, the high CDA expression group had significantly higher EMT scores than the low group (Fig. 6C), suggesting that CDA has a key role in the EMT during lung-cancer progression.
We further examined CDA expression in primary cancer cells from ALKrearranged NSCLC patients who showed acquired resistance to crizotinib [53].Notably, CDA expression was higher in patient-derived resistant cells (SNU-2535, -2550, -2563) than in naïve ALK -positive NSCLC patient-derived cells (SNU-3166) at both the mRNA and protein levels (Fig. 6D, Supplementary Table 11).We obtained NSCLC tumor biopsies from 11 patients with EML4 -ALK rearrangements before and/or after ALK inhibitor therapy.Contrary to our expectation, immunohistochemistry of CDA showed no significant difference between pre-and post-ALK inhibitor therapy (Fig. 6E, left).This might be attributable to inter-or intratumoral heterogeneity of CDA expression even in the same patient [54].Representative CDA expression in tumors from pre-or post-TKI therapy is shown in Fig. 6E.

Discussion
Proliferating cancer cells undergo metabolic adaptations in order to survive in the harsh tumor microenvironment [55,56].Therefore, targeting cancer-specific metabolism may present an effective therapeutic strategy [55,57].Our current results demonstrate the epigenetic activation of CDA during development of ceritinib resistance in NSCLC with an EML4 -ALK fusion.scRNA-seq analysis identified CDA upregulation as one of the primary characteristics for ALK inhibitor resistance.CDA-overexpressing cells have a growth advantage during ceritinib treatment and are thus selected and propagated, thereby contributing to acquired resistance.Decreased methylation of the CDA promoter and enhancer, along with recruitment of EMT-related TFs, can at least partially explain the increased CDA expression in resistant cells.We propose that targeting CDA-directed metabolism with epigenome-related nucleosides such as 5fdC presents a new strategy for ablating ALK inhibitor-resistant cells via accumulation of DNA damage leading to cell death (Fig. 6F).
As de novo pyrimidine biosynthesis is an energetically expensive pathway for cell growth and development [58], CDA may provide an energetically efficient bypass for rapidly proliferating LR cells that can meet their pyrimidine requirements via a salvage pathway using either intracellular nucleic acid degradation products or extracellular nucleosides [59].We demonstrate here that inhibition of CDA using siRNA or tetrahydrouridine reduced LR cell proliferation (Fig. 3).Until now, many studies have shown that reprogramming of pyrimidine metabolism is closely related to cancer progression [60].Beyond DNA and RNA biosynthesis, the recycling of cytidine and uridine is involved in phospholipid synthesis and protein glycosylation (Fig. 6F).CDP-choline production using CTP and choline phosphate is a rate-limiting step for phosphatidylcholine biosynthesis in mammals.Thus, an intracellular CTP pool is critical for phosphatidylcholine production and subsequent phospholipid biosynthesis and membrane biogenesis [61].Glycosaminoglycans, derived from UDP and glucose, are distributed on the cell surface and in the extracellular matrix and play critical roles in cell-cell and cell-matrix adhesion, cell proliferation, and migration [62].Furthermore, the pyrimidine salvage pathway is involved in the ER stress response, and CDA both balances the pool of deoxycytidine and deoxyuridine upon ER stress and increases the adaptive capacity of cells [63].
CDA also has critical roles in the EMT.We demonstrated that CDA knockdown in LR cells reversed EMT and reduced cell migration and invasion (Fig. 3).Clinically, CDA expression is associated with the EMT signature in lung-cancer patients (Fig. 6C).The EMT is a dynamic process in which tumor cells can occupy intermediate EMT states (partial EMT) and can revert to a more epithelial phenotype through the reverse process, i.e., mesenchymal-to-epithelial transition.Epigenetic changes such as DNA methylation and histone modifications direct this dynamic process [64].We found that promoter and enhancer regions of CDA were demethylated in cells with acquired resistance and formed an open chromatin structure to bind TFs such as SMAD3, TEAD1, and FOXM1 (Fig. 4).CDA expression was also linked with EMT-related genes such as CAV2, TXNRD1, and HISTH2BK (Fig. 3J).The EMT has been associated with both metastasis and drug resistance [65].CDA is frequently overexpressed in many cancers, including pancreatic, stomach, testicular, and vaginal cancer [66].Moreover, CDA overexpression can mediate resistance to chemotherapy based on cytidine analogs such as gemcitabine [67].Intriguingly, CDA upregulation has been documented in cancer cells resistant to palbociclib (CDK 4/6 inhibitor) [46], trametinib (MEK inhibitor) [47], olaparib (PARP inhibitor) [48], and gefitinib (EGFR inhibitor) [50].Therefore, targeting CDA with epigenome-related nucleosides may improve the effectiveness of current strategies for overcoming resistance to these targeted therapies.
Modified nucleosides that are common in the epigenome can disrupt regulation of gene expression if they are recycled and incorporated into the DNA, and thus incorporation must be prevented in most healthy cells.In the case of 5-methyl-2'deoxycytidine (5mdC), the nucleoside can be recycled in a different form (dTTP) through deamination [27].In contrast, the oxidized epigenetic nucleosides 5hmdC and 5fdC cannot be converted to canonical nucleotides in normal cells because cytidine monophosphate kinase 1 (CMPK1) can phosphorylate only unmodified dCMP [12].In CDA-overexpressing cancer cells, however, 5hmdC and 5fdC can be deaminated to yield 5hmdU and 5fdU, respectively, which can be incorporated into DNA, leading to cell-cycle arrest and/or accumulation of cytotoxic double-stranded DNA breaks that cause cell death [12].We have demonstrated that administration of 5fdC inhibited the proliferation of ALK inhibitor-resistant NSCLC cells in vitro and in vivo (Fig. 5).Although our in vivo study was very small, 5fdC showed selective abolishment of highly proliferating and CDA-overexpressing cells with no adverse effects in mice.Further, more-extensive studies are needed to determine the efficacy and safety of 5fdC.
This study provides a proof of concept that single-cell transcriptome analysis can identify key players in acquired resistance to cancer therapy and that metabolic mechanisms may offer a vulnerability in such cancer cells that can be utilized to overcome resistance to targeted therapies.In this study, we focused on CDA because CDA-overexpressing cells were present in the naïve cell population and were propagated during ceritinib treatment.Future studies will be required to dissect the potential role of CDA in the EMT and survival of resistant cancer cells during targeted therapy.

Conclusion
Resistance to ALK inhibitors is an urgent problem for ALK fusion-positive lungcancer therapies.We identified cytidine deaminase (CDA) as a potential druggable target to treat ALK inhibitor-resistant NSCLC.Drug-induced epigenetic reprogramming via demethylation of the CDA promoter and enhancer activates CDA transcription as part of acquired resistance.Targeting CDA-directed metabolism using epigenome-related nucleosides such as 5-formyl-2'-deoxycytidine (5fdC) selectively ablates CDA-overexpressing resistant cells via the accumulation of DNA damage.Thus, blockade of CDA-overexpressing cells could be a new strategy to overcome resistance to ALK inhibitors.The highest dotted red line is for a sample with higher expression than the median value for the high-expression cohort; the lowest dotted blue line is for a sample with lower expression than the median value for the low-expression cohort.HR, hazard ratio.C EMT scores in lung adenocarcinoma tumor samples with low (n =2 5 7 ) or high (n =2 5 8 )CDA expression.D Levels of CDA mRNA (left) and protein (right) assessed by qRT-PCR and western blotting of primary cancer cells from ALK -rearranged NSCLC patients with acquired resistance to crizotinib.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, **P < 0.01 (Mann-Whitney U test).E Immunohistochemical staining of CDA in tumor biopsies from 11 NSCLC patients with ALK rearrangements before and/or after ALK inhibitor therapy.Left: Boxplot depicting CDA H-score distribution in NSCLC patients pre-treated (n =8 )o rp o s t -t r e a t e d( n =8 )w i t hT K I s .R i g h t :R e p r e s e n t a t i v ei m a g e s .F Model for targeting CDA-directed metabolism with 5fdC to overcome resistance to ALK inhibitors.CDA-overexpressing cells were pre-existed in the naïve cell population and

Fig. 2
Fig.2 scRNA-seq analysis of H3122 and LR cells.A Left: t-SNE showing clusters C0-C11 from H3122 and LR.Top right: UAMP plot showing scRNA-seq results for H3122 (n =4 , 8 3 5 )a n dL R( n =4 , 5 6 6 ) .B o t t o mr i g h t :U M A P plot representing scRNA-seq results based on cell-cycle phase.B Top: Proportion of H3122 and LR in each cluster.Bottom: Proportion of H3122 and LR in each cell-cycle phase.(Colors defined in panel A.) C Heatmap of DEGs between H3122 and LR.Filtering is based on cutoff values of |Log2FC| > 0.25 and FDR < 0.001.D Expression ratio of CDA High ,A X L High ,a n dC D A High +A X L High cells in each cluster for H3122 or LR.E Expression of CDA, AXL,a n dDUSP6 assessed by UMAP maps.F Top: Map of CDA from the UCSC Genome Browser.Bottom left: DNA methylation of CDA in H3122 and LR.Pie charts represent the percentage of methylation (black) of the individual Infinium Human Methylation 450K BeadChip probes cg04087271, cg20619374, and cg06984156 (red lines).Total methylation ratios are indicated under each pie chart.Bottom right: Relative mRNA levels of CDA in H3122 and LR.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, **P < 0.01 (Mann-Whitney U test).Far right: Bisulfite sequencing of CDA CpG sites (cg04087271, cg20619374, and cg06984156) in H3122 and LR.Closed and open circles represent methylated and unmethylated cytosines, respectively.Percentage of methylated cytosines is indicated at bottom.G Violin plots showing CDA and AXL expression in PC9 cells treated with erlotinib for the indicated times.Data were obtained from GSE134839.H UMAP showing clusters C0-C9 from PC9 cells treated for 3d a y sw i t he r l o t i n i b .D a t aw e r eo b t a i n e df r o mG S E 1 4 9 3 8 3 .I Gene interaction map for interactions between CDA and related genes.Fig.3Effects of CDA knockdown on ceritinib-resistant cells.A Western blot analysis of CDA in H3122 and LR cell lysates.B,C Knockdown efficiency of siRNAs as assessed by qRT-PCR (B) and western blotting (C).n =3 independent experiments, mean ± SD, ***P < (Mann-Whitney U test).For western blotting, GAPDH was used as al o a d i n gc o n t r o l .D Proliferation of LR cells transfected with siRNA for the indicated times.n =3i n d e p e n d e n t experiments, mean ± SEM, ***P < 0.001 (unpaired two-sided t test).E Proliferation of H3122 and LR cells treated with tetrahydrouridine (0-20 µM) for the indicated times.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SEM, *P < 0.05, **P < 0.01, ***P < 0.001 (unpaired two-sided t test).F Expression of EMT-related proteins after CDA knockdown.GAPDH was used as a loading control.G Top: Wound healing analysis of CDA-depleted LR cells at 0 and 15 h after the cell surface was scratched.Scale bars, 100 µm.Bottom: Relative cell-covered wound area at 0 and 15 h.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, *P < 0.05 (Mann-Whitney U test).H Top: Representative microscopic images of migrating and invading CDA-depleted LR cells 48 h after seeding.Scale bars, 100 µm.Bottom: Relative numbers of migrating and invading cells.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, *P < 0.05, **P < 0.01 (Mann-Whitney U test).I Left: Heatmap showing DEGs in CDA-depleted LR cells.DEGs were filtered based on cutoff values of |Log2FC| > 0.6 and FDR < 0.05.J Relative mRNA expression levels of target genes in CDA-depleted LR cells.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, *P < 0.05, **P < 0.01, ***P < 0.001 (Mann-Whitney U test).Fig. 4 Integrated analysis of scRNA-seq and scATAC-seq in LR cells.A Left: Integrated scRNA-seq (n =4 , 5 6 6 )a n d scATAC-seq (n = 6,772) described by UMAP.Right: Annotated scATAC-seq clusters based on scRNA-seq.B Left: Calculation and classification of interaction scores.Right: Distribution of different genomic regions in OCR groups.C Genomic Regions Enrichment of Annotations Tool ontology analysis of OCRs.P values shown are Bonferroni adjusted (n.d., not detected).D Transcription-factor motif enrichment analysis of OCRs.The most highly enriched motifs are shown.E ATAC signal for each cluster in the gene b o dy of CDA.O C R sr e g u l a t i n gCDA expression are displayed with the UCSC Genome Browser.F Relative CDA mRNA expression following knockdown of the indicated TFs using siRNAs.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, *P < 0.05, **P < 0.01 (Mann-Whitney U test).G Schematic diagram of the regulatory roles of TEAD1, SMAD3, and FOXM1 in CDA activation in ceritinib-resistant cells.Fig. 5 Antiproliferative effects of 5fdC on ceritinib-resistant cells.A Proliferation of H3122 and LR cells treated with 5hmdC (left) or 5fdC (right) for the indicated times.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SEM, *P < 0.05, ***P < 0.001 (unpaired two-sided t test).B Colony-formation assay of H3122 and LR cells treated with DMSO (control) or 5fdC (1 or 10 µM) for 10 days.Left: Representative images.Right: Relative numbers of colonies.n =3 independent experiments, mean ± SD, **P < 0.01, P < 0.001 (Mann-Whitney U test).C Left: Flow-cytometry analysis of annexin V and propidium iodide (PI) staining of LR cells treated with 5fdC (10 µM) for 24 or 48 h.Right: Histogram showing percentages of cells in early apoptosis, late apoptosis, and necrosis.D Immunofluorescence labeling of γ-H2AX (DNA damage) and H3PS10 (cell proliferation) in H3122 and LR cells.Cells were treated with DMSO (control) or 5fdC (10 µM) for 24 h.Nuclear DNA was stained with DAPI (blue).Top: Representative images.Scale bar, 20 µm.Bottom: Fluorescence intensity of H3PS10 and γ-H2AX.n =3 independent experiments, mean ± SEM, *P < 0.05 (Mann-Whitney U test).E In vivo treatment of mice with LR cells and 5fdC.Top left: Schematic diagram of treatment.Top right: Mouse body weight over the course of 5fdC administration.Bottom: Photographs of dissected tumors.Bar plot of dissected tumor weight; mean ± SD (n =5 , Mann-Whitney U test).F Evaluation of proliferation (Ki-67) and DNA damage (γ-H2AX) in dissected tumor samples.Representative images.Scale bar, 50 µm (H&E), 20 µm( i m m u n o fl u o r e s c e n c e ) .Fig. 6 Clinical relevance of CDA expression in NSCLC patients.A CDA expression in lung adenocarcinoma tumor (n =4 8 3 ,f r o mT C G A )a n dn o r m a l( n =3 4 7 ,f r o mG T E x )t i s s u es a m p l e s .E a c hd o tr e p r e s e n t se x p r e s s i o ni nas i n g l e sample.B Overall survival of lung adenocarcinoma patients with low or high CDA expression as analyzed by Kaplan-Meier method and log-rank test.Median values in transcripts per million (TPM) are indicated by solid red and blue lines.The highest dotted red line is for a sample with higher expression than the median value for the high-expression cohort; the lowest dotted blue line is for a sample with lower expression than the median value for the low-expression cohort.HR, hazard ratio.C EMT scores in lung adenocarcinoma tumor samples with low (n =25 7 )   or high (n =2 5 8 )CDA expression.D Levels of CDA mRNA (left) and protein (right) assessed by qRT-PCR and western blotting of primary cancer cells from ALK -rearranged NSCLC patients with acquired resistance to crizotinib.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, **P < 0.01 (Mann-Whitney U test).E Immunohistochemical staining of CDA in tumor biopsies from 11 NSCLC patients with ALK rearrangements before and/or after ALK inhibitor therapy.Left: Boxplot depicting CDA H-score distribution in NSCLC patients pre-treated (n =8 )o rp o s t -t r e a t e d( n =8 )w i t hT K I s .R i g h t :R e p r e s e n t a t i v ei m a g e s .F Model for targeting CDA-directed metabolism with 5fdC to overcome resistance to ALK inhibitors.CDA-overexpressing cells were pre-existed in the naïve cell population and

Fig. 3
Fig.2 scRNA-seq analysis of H3122 and LR cells.A Left: t-SNE showing clusters C0-C11 from H3122 and LR.Top right: UAMP plot showing scRNA-seq results for H3122 (n =4 , 8 3 5 )a n dL R( n =4 , 5 6 6 ) .B o t t o mr i g h t :U M A P plot representing scRNA-seq results based on cell-cycle phase.B Top: Proportion of H3122 and LR in each cluster.Bottom: Proportion of H3122 and LR in each cell-cycle phase.(Colors defined in panel A.) C Heatmap of DEGs between H3122 and LR.Filtering is based on cutoff values of |Log2FC| > 0.25 and FDR < 0.001.D Expression ratio of CDA High ,A X L High ,a n dC D A High +A X L High cells in each cluster for H3122 or LR.E Expression of CDA, AXL,a n dDUSP6 assessed by UMAP maps.F Top: Map of CDA from the UCSC Genome Browser.Bottom left: DNA methylation of CDA in H3122 and LR.Pie charts represent the percentage of methylation (black) of the individual Infinium Human Methylation 450K BeadChip probes cg04087271, cg20619374, and cg06984156 (red lines).Total methylation ratios are indicated under each pie chart.Bottom right: Relative mRNA levels of CDA in H3122 and LR.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, **P < 0.01 (Mann-Whitney U test).Far right: Bisulfite sequencing of CDA CpG sites (cg04087271, cg20619374, and cg06984156) in H3122 and LR.Closed and open circles represent methylated and unmethylated cytosines, respectively.Percentage of methylated cytosines is indicated at bottom.G Violin plots showing CDA and AXL expression in PC9 cells treated with erlotinib for the indicated times.Data were obtained from GSE134839.H UMAP showing clusters C0-C9 from PC9 cells treated for 3d a y sw i t he r l o t i n i b .D a t aw e r eo b t a i n e df r o mG S E 1 4 9 3 8 3 .I Gene interaction map for interactions between CDA and related genes.Fig.3Effects of CDA knockdown on ceritinib-resistant cells.A Western blot analysis of CDA in H3122 and LR cell lysates.B,C Knockdown efficiency of siRNAs as assessed by qRT-PCR (B) and western blotting (C).n =3 independent experiments, mean ± SD, ***P < (Mann-Whitney U test).For western blotting, GAPDH was used as al o a d i n gc o n t r o l .D Proliferation of LR cells transfected with siRNA for the indicated times.n =3i n d e p e n d e n t experiments, mean ± SEM, ***P < 0.001 (unpaired two-sided t test).E Proliferation of H3122 and LR cells treated with tetrahydrouridine (0-20 µM) for the indicated times.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SEM, *P < 0.05, **P < 0.01, ***P < 0.001 (unpaired two-sided t test).F Expression of EMT-related proteins after CDA knockdown.GAPDH was used as a loading control.G Top: Wound healing analysis of CDA-depleted LR cells at 0 and 15 h after the cell surface was scratched.Scale bars, 100 µm.Bottom: Relative cell-covered wound area at 0 and 15 h.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, *P < 0.05 (Mann-Whitney U test).H Top: Representative microscopic images of migrating and invading CDA-depleted LR cells 48 h after seeding.Scale bars, 100 µm.Bottom: Relative numbers of migrating and invading cells.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, *P < 0.05, **P < 0.01 (Mann-Whitney U test).I Left: Heatmap showing DEGs in CDA-depleted LR cells.DEGs were filtered based on cutoff values of |Log2FC| > 0.6 and FDR < 0.05.J Relative mRNA expression levels of target genes in CDA-depleted LR cells.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, *P < 0.05, **P < 0.01, ***P < 0.001 (Mann-Whitney U test).Fig. 4 Integrated analysis of scRNA-seq and scATAC-seq in LR cells.A Left: Integrated scRNA-seq (n =4 , 5 6 6 )a n d scATAC-seq (n = 6,772) described by UMAP.Right: Annotated scATAC-seq clusters based on scRNA-seq.B Left: Calculation and classification of interaction scores.Right: Distribution of different genomic regions in OCR groups.C Genomic Regions Enrichment of Annotations Tool ontology analysis of OCRs.P values shown are Bonferroni adjusted (n.d., not detected).D Transcription-factor motif enrichment analysis of OCRs.The most highly enriched motifs are shown.E ATAC signal for each cluster in the gene b o dy of CDA.O C R sr e g u l a t i n gCDA expression are displayed with the UCSC Genome Browser.F Relative CDA mRNA expression following knockdown of the indicated TFs using siRNAs.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, *P < 0.05, **P < 0.01 (Mann-Whitney U test).G Schematic diagram of the regulatory roles of TEAD1, SMAD3, and FOXM1 in CDA activation in ceritinib-resistant cells.Fig. 5 Antiproliferative effects of 5fdC on ceritinib-resistant cells.A Proliferation of H3122 and LR cells treated with 5hmdC (left) or 5fdC (right) for the indicated times.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SEM, *P < 0.05, ***P < 0.001 (unpaired two-sided t test).B Colony-formation assay of H3122 and LR cells treated with DMSO (control) or 5fdC (1 or 10 µM) for 10 days.Left: Representative images.Right: Relative numbers of colonies.n =3 independent experiments, mean ± SD, **P < 0.01, P < 0.001 (Mann-Whitney U test).C Left: Flow-cytometry analysis of annexin V and propidium iodide (PI) staining of LR cells treated with 5fdC (10 µM) for 24 or 48 h.Right: Histogram showing percentages of cells in early apoptosis, late apoptosis, and necrosis.D Immunofluorescence labeling of γ-H2AX (DNA damage) and H3PS10 (cell proliferation) in H3122 and LR cells.Cells were treated with DMSO (control) or 5fdC (10 µM) for 24 h.Nuclear DNA was stained with DAPI (blue).Top: Representative images.Scale bar, 20 µm.Bottom: Fluorescence intensity of H3PS10 and γ-H2AX.n =3 independent experiments, mean ± SEM, *P < 0.05 (Mann-Whitney U test).E In vivo treatment of mice with LR cells and 5fdC.Top left: Schematic diagram of treatment.Top right: Mouse body weight over the course of 5fdC administration.Bottom: Photographs of dissected tumors.Bar plot of dissected tumor weight; mean ± SD (n =5 , Mann-Whitney U test).F Evaluation of proliferation (Ki-67) and DNA damage (γ-H2AX) in dissected tumor samples.Representative images.Scale bar, 50 µm (H&E), 20 µm( i m m u n o fl u o r e s c e n c e ) .Fig. 6 Clinical relevance of CDA expression in NSCLC patients.A CDA expression in lung adenocarcinoma tumor (n =4 8 3 ,f r o mT C G A )a n dn o r m a l( n =3 4 7 ,f r o mG T E x )t i s s u es a m p l e s .E a c hd o tr e p r e s e n t se x p r e s s i o ni nas i n g l e sample.B Overall survival of lung adenocarcinoma patients with low or high CDA expression as analyzed by Kaplan-Meier method and log-rank test.Median values in transcripts per million (TPM) are indicated by solid red and blue lines.The highest dotted red line is for a sample with higher expression than the median value for the high-expression cohort; the lowest dotted blue line is for a sample with lower expression than the median value for the low-expression cohort.HR, hazard ratio.C EMT scores in lung adenocarcinoma tumor samples with low (n =25 7 )   or high (n =2 5 8 )CDA expression.D Levels of CDA mRNA (left) and protein (right) assessed by qRT-PCR and western blotting of primary cancer cells from ALK -rearranged NSCLC patients with acquired resistance to crizotinib.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, **P < 0.01 (Mann-Whitney U test).E Immunohistochemical staining of CDA in tumor biopsies from 11 NSCLC patients with ALK rearrangements before and/or after ALK inhibitor therapy.Left: Boxplot depicting CDA H-score distribution in NSCLC patients pre-treated (n =8 )o rp o s t -t r e a t e d( n =8 )w i t hT K I s .R i g h t :R e p r e s e n t a t i v ei m a g e s .F Model for targeting CDA-directed metabolism with 5fdC to overcome resistance to ALK inhibitors.CDA-overexpressing cells were pre-existed in the naïve cell population and

Fig. 6
Fig.2 scRNA-seq analysis of H3122 and LR cells.A Left: t-SNE showing clusters C0-C11 from H3122 and LR.Top right: UAMP plot showing scRNA-seq results for H3122 (n =4 , 8 3 5 )a n dL R( n =4 , 5 6 6 ) .B o t t o mr i g h t :U M A P plot representing scRNA-seq results based on cell-cycle phase.B Top: Proportion of H3122 and LR in each cluster.Bottom: Proportion of H3122 and LR in each cell-cycle phase.(Colors defined in panel A.) C Heatmap of DEGs between H3122 and LR.Filtering is based on cutoff values of |Log2FC| > 0.25 and FDR < 0.001.D Expression ratio of CDA High ,A X L High ,a n dC D A High +A X L High cells in each cluster for H3122 or LR.E Expression of CDA, AXL,a n dDUSP6 assessed by UMAP maps.F Top: Map of CDA from the UCSC Genome Browser.Bottom left: DNA methylation of CDA in H3122 and LR.Pie charts represent the percentage of methylation (black) of the individual Infinium Human Methylation 450K BeadChip probes cg04087271, cg20619374, and cg06984156 (red lines).Total methylation ratios are indicated under each pie chart.Bottom right: Relative mRNA levels of CDA in H3122 and LR.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, **P < 0.01 (Mann-Whitney U test).Far right: Bisulfite sequencing of CDA CpG sites (cg04087271, cg20619374, and cg06984156) in H3122 and LR.Closed and open circles represent methylated and unmethylated cytosines, respectively.Percentage of methylated cytosines is indicated at bottom.G Violin plots showing CDA and AXL expression in PC9 cells treated with erlotinib for the indicated times.Data were obtained from GSE134839.H UMAP showing clusters C0-C9 from PC9 cells treated for 3d a y sw i t he r l o t i n i b .D a t aw e r eo b t a i n e df r o mG S E 1 4 9 3 8 3 .I Gene interaction map for interactions between CDA and related genes.Fig.3Effects of CDA knockdown on ceritinib-resistant cells.A Western blot analysis of CDA in H3122 and LR cell lysates.B,C Knockdown efficiency of siRNAs as assessed by qRT-PCR (B) and western blotting (C).n =3 independent experiments, mean ± SD, ***P < (Mann-Whitney U test).For western blotting, GAPDH was used as al o a d i n gc o n t r o l .D Proliferation of LR cells transfected with siRNA for the indicated times.n =3i n d e p e n d e n t experiments, mean ± SEM, ***P < 0.001 (unpaired two-sided t test).E Proliferation of H3122 and LR cells treated with tetrahydrouridine (0-20 µM) for the indicated times.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SEM, *P < 0.05, **P < 0.01, ***P < 0.001 (unpaired two-sided t test).F Expression of EMT-related proteins after CDA knockdown.GAPDH was used as a loading control.G Top: Wound healing analysis of CDA-depleted LR cells at 0 and 15 h after the cell surface was scratched.Scale bars, 100 µm.Bottom: Relative cell-covered wound area at 0 and 15 h.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, *P < 0.05 (Mann-Whitney U test).H Top: Representative microscopic images of migrating and invading CDA-depleted LR cells 48 h after seeding.Scale bars, 100 µm.Bottom: Relative numbers of migrating and invading cells.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, *P < 0.05, **P < 0.01 (Mann-Whitney U test).I Left: Heatmap showing DEGs in CDA-depleted LR cells.DEGs were filtered based on cutoff values of |Log2FC| > 0.6 and FDR < 0.05.J Relative mRNA expression levels of target genes in CDA-depleted LR cells.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, *P < 0.05, **P < 0.01, ***P < 0.001 (Mann-Whitney U test).Fig. 4 Integrated analysis of scRNA-seq and scATAC-seq in LR cells.A Left: Integrated scRNA-seq (n =4 , 5 6 6 )a n d scATAC-seq (n = 6,772) described by UMAP.Right: Annotated scATAC-seq clusters based on scRNA-seq.B Left: Calculation and classification of interaction scores.Right: Distribution of different genomic regions in OCR groups.C Genomic Regions Enrichment of Annotations Tool ontology analysis of OCRs.P values shown are Bonferroni adjusted (n.d., not detected).D Transcription-factor motif enrichment analysis of OCRs.The most highly enriched motifs are shown.E ATAC signal for each cluster in the gene b o dy of CDA.O C R sr e g u l a t i n gCDA expression are displayed with the UCSC Genome Browser.F Relative CDA mRNA expression following knockdown of the indicated TFs using siRNAs.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, *P < 0.05, **P < 0.01 (Mann-Whitney U test).G Schematic diagram of the regulatory roles of TEAD1, SMAD3, and FOXM1 in CDA activation in ceritinib-resistant cells.Fig. 5 Antiproliferative effects of 5fdC on ceritinib-resistant cells.A Proliferation of H3122 and LR cells treated with 5hmdC (left) or 5fdC (right) for the indicated times.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SEM, *P < 0.05, ***P < 0.001 (unpaired two-sided t test).B Colony-formation assay of H3122 and LR cells treated with DMSO (control) or 5fdC (1 or 10 µM) for 10 days.Left: Representative images.Right: Relative numbers of colonies.n =3 independent experiments, mean ± SD, **P < 0.01, P < 0.001 (Mann-Whitney U test).C Left: Flow-cytometry analysis of annexin V and propidium iodide (PI) staining of LR cells treated with 5fdC (10 µM) for 24 or 48 h.Right: Histogram showing percentages of cells in early apoptosis, late apoptosis, and necrosis.D Immunofluorescence labeling of γ-H2AX (DNA damage) and H3PS10 (cell proliferation) in H3122 and LR cells.Cells were treated with DMSO (control) or 5fdC (10 µM) for 24 h.Nuclear DNA was stained with DAPI (blue).Top: Representative images.Scale bar, 20 µm.Bottom: Fluorescence intensity of H3PS10 and γ-H2AX.n =3 independent experiments, mean ± SEM, *P < 0.05 (Mann-Whitney U test).E In vivo treatment of mice with LR cells and 5fdC.Top left: Schematic diagram of treatment.Top right: Mouse body weight over the course of 5fdC administration.Bottom: Photographs of dissected tumors.Bar plot of dissected tumor weight; mean ± SD (n =5 , Mann-Whitney U test).F Evaluation of proliferation (Ki-67) and DNA damage (γ-H2AX) in dissected tumor samples.Representative images.Scale bar, 50 µm (H&E), 20 µm( i m m u n o fl u o r e s c e n c e ) .Fig. 6 Clinical relevance of CDA expression in NSCLC patients.A CDA expression in lung adenocarcinoma tumor (n =4 8 3 ,f r o mT C G A )a n dn o r m a l( n =3 4 7 ,f r o mG T E x )t i s s u es a m p l e s .E a c hd o tr e p r e s e n t se x p r e s s i o ni nas i n g l e sample.B Overall survival of lung adenocarcinoma patients with low or high CDA expression as analyzed by Kaplan-Meier method and log-rank test.Median values in transcripts per million (TPM) are indicated by solid red and blue lines.The highest dotted red line is for a sample with higher expression than the median value for the high-expression cohort; the lowest dotted blue line is for a sample with lower expression than the median value for the low-expression cohort.HR, hazard ratio.C EMT scores in lung adenocarcinoma tumor samples with low (n =25 7 )   or high (n =2 5 8 )CDA expression.D Levels of CDA mRNA (left) and protein (right) assessed by qRT-PCR and western blotting of primary cancer cells from ALK -rearranged NSCLC patients with acquired resistance to crizotinib.n =3i n d e p e n d e n te x p e r i m e n t s ,m e a n± SD, **P < 0.01 (Mann-Whitney U test).E Immunohistochemical staining of CDA in tumor biopsies from 11 NSCLC patients with ALK rearrangements before and/or after ALK inhibitor therapy.Left: Boxplot depicting CDA H-score distribution in NSCLC patients pre-treated (n =8 )o rp o s t -t r e a t e d( n =8 )w i t hT K I s .R i g h t :R e p r e s e n t a t i v ei m a g e s .F Model for targeting CDA-directed metabolism with 5fdC to overcome resistance to ALK inhibitors.CDA-overexpressing cells were pre-existed in the naïve cell population and

Figures Figure 1
Figures

Figure 2 See
Figure 2

Figure 3 See
Figure 3

Figure 4 See
Figure 4

Figure 5 See
Figure 5

Figure 6 See
Figure 6