Genomic landscapes reveal post-transcriptional modier disruption in cholangiocarcinoma

Molecular variation of different geographic populations and cholangiocarcinoma (CCA) subtypes indicate CCA’s potential genomic heterogeneity and novel genomic features. We analyzed exome-sequencing data of 87 perihialr cholangiocarcinoma (pCCA) and 261 intrahepatic cholangiocarcinoma (iCCA) from 3 Asian centers (including 43 pCCAs and 24 iCCAs from our center). Patients with iCCA presented higher tumor mutation burden and copy number alteration burden (CNAB) than pCCA patients, and CNAB indicated a poorer pCCA prognosis. In our newly identied 12 signicantly mutated genes and 5 focal CNA regions, post-transcriptional modication-related novel driver genes METTL14 and RBM10 are prone to occur in pCCA. We demonstrated the tumor-suppressing functional role of METTL14, a major RNA N6-adenosine methyltransferase (m6A), and its loss-of-function mutation R298H may function through m6A modication on new driver gene MACF1. Our results may be valuable for better understanding of post-transcriptional modication effects CCA development, and similarities and differences between pCCA and iCCA. loss pattern. Germline mutations in BRCA2, which encodes a BRCA-associated protein, were reported in familial CCA cases 47, 48 . Our study also suggested that BRCA2 could be affected by somatic mutations. MLLT4, also known as AF6, commonly fused with KMT2A in acute myeloid leukemia. It has also been identied as a SMGs in breast cancers 49 . Few genomic alterations has ever been reported in NACC1, but it is activated in ovarian serous carcinomas and inuences cell apoptosis, senescence, and cytokinesis in cancer cells 50, 51 . Integrated whole-exome analysis of Asian CCAs identied new CCA driver genes and emphasized the similarities and distinctions genomic characteristics between pCCA and iCCA. Importantly, our study also highlighted the effect of underlying post-transcriptional modication in CCA occurrence and development. These results provide a better understanding of the CCA mutational landscape that may drive improvements in clinical practice.


Introduction
Cholangiocarcinoma (CCA) is a malignant primary hepatobiliary disease originating from every point of the biliary tree, from the canals of Hering to the main bile duct 1 . In recent decades, the incidence and mortality of CCA have increased globally, especially in East Asia 2, 3 . In China, the incidence of CCA reached 6-7.55 per 100,000 individuals 4,5 . The mortality of CCA is almost equal to its incidence due to resistance to common treatments and poor prognosis 6 . The overall ve-year survival rate is only 10% 7 .
Even for the patients with CCA who are amenable to surgical resection, and the corresponding ve-year survival rate is only 25-35% 8 , and high recurrence rates of 50 to 60% persist upon diagnosis, with no effective therapy.
CCA is regarded as a group of different diseases that are further divided into intrahepatic CCA (iCCA), perihilar CCA (pCCA), and distal CCA (dCCA) based on anatomical location 9 . pCCA is the most frequent subtype, accounting for 50-60% of all CCA tumors 9 . In general, the incidence and mortality rates of iCCA was reported rising worldwide in the past decade, whereas the incidences of pCCA seem to be stable or decreasing 10 . Besides the burden of HCV infection has been linked with the incidence of iCCA, infection with liver ukes and HBV-related liver diseases have shown a stronger association for iCCA in East Asia 10,11 . Primary sclerosing cholangitis (PSC) is considered as a predominant risk factor for CCA, especially for pCCA 10,12 . The pronounced geographical and etiology heterogeneity of iCCA and pCCA indicated the potential diverse cancer-initiating cell in these two subtypes of CCA 13 . In detail, iCCA is considered to originate two different cell of origin (mucin-secreting cholangiocyte or hepatic progenitor cell), due to the signi cant inter-tumor heterogeneity 14 . Unlike iCCA, pCCA only originate from mucin-secreting cholangiocyte 13,15 .
To date, previous studies of the genomic alterations in a variety of bile duct cancer, especially for iCCA, have demonstrated some known commonalities in mutation of CCA. These studies identi ed a number of driver genes, but most of them represented the genomic features of iCCA, such as TP53, KRAS, SMAD4, and ARID1A [16][17][18][19][20] . Notably, the latest integrative genomic analysis of Caucasian extrahepatic CCA (eCCA) (including the pCCA and dCCA) revealed distinct molecular characterization of eCCA in western 21 . However, the epidemiological pro le of CCA and its subtypes shows enormous geographical variation, indicating underlying genomic heterogeneity across different regions 5 . The prevalence of CCA is highest in East Asia and the subtype distribution is similar. Thus, integration analysis in sequencing data from Asian countries may nd new genomic features ignored by previous studies. Here, we performed wholeexome sequencing on 43 pCCA and 24 iCCA, and analyzed an additional 44 pCCA and 237 CCA to investigate the genomic landscapes of pCCA and iCCA, respectively.

Methods And Materials
Detailed information about the clinical sample collection and follow-up; DNA preparation, DNA capture, and sequencing; Next-generation sequencing; Sequencing alignment and detection of somatic variants; Mutational burden and signature analysis; Identi cation of CCA driver genes and comparison between the iCCA and pCCA; Somatic copy number estimation; Highly ampli ed/deleted regions identi cation;  Table 2) in all the cases, and the median number per case was 120.5. The median tumor mutation burden (TMB) was 2.0 megabase per case (/Mb) and the burden was comparable between pCCA (2.0/Mb) and iCCA (2.0/Mb) (Wilcoxon rank-sum test P = 0.48, Supplementary Fig. 1A). Among these mutations, we de ned 28,254 non-synonymous mutations, for which the median was 33 Single Nucleotide Variants (SNVs) and 3 small insertion and deletions (INDELs) per patient. Interestingly, the non-synonymous mutation burden (median 0.61/Mb) of iCCA was signi cantly higher than that (median 0.47/Mb) of pCCA (Wilcoxon ranksum test P = 9.6×10 -4 , Fig.1A). Similar to the non-synonymous mutation pattern, iCCA had a higher copy number alteration burden (CNAB) than pCCA (iCCA median = 25.0%, pCCA median = 8.4%, Wilcoxon ranksum test P = 2.4×10 -4 ), no matter the ampli cation or deletion (Fig. 1C).
Next, we investigated the genomic mutational signatures, which provide clues for the carcinogenesis of CCA. Firstly, we classi ed all the mutations into six classic subtypes. Consistent with previous studies 17, 18  signatures associated with activated APOBECs (Signatures 2 and 13, 6.48%), and signature 8 (6.41%) were dominant (proportion > 5%) in pCCA patients (Fig. 1B). For iCCA patients, COSMIC signature 1 (27.19%) and signature MMR (26.74%) were also identi ed as dominant, but the proportion was slightly lower than in pCCA patients. In addition, signature 4 related to smoking behavior (10.13%), signature 22 (9.96%) related to aristolochic acid, and signatures speci c to liver cancer (Liver, Signatures 12, 16, and 24, 8.7%) were predominant in iCCA. Interestingly, signature Liver were more common in HBV-related iCCA (HBV: 26.0%, non-HBV: 3.8%, Supplementary Table 3). Aristolochic acid-related mutations were mainly found in iCCA patients from Zou et al.'s study, but this signature was also observed in pCCA patients (Supplementary Table 4). In addition, signature 3, related to homologous recombination de ciency (4.26%), signature 10, related to altered POLE activity (2.44%), and signature 7, related to ultraviolet (UV) light exposure (2.17%), were only identi ed in iCCA (Fig. 1B), which suggested that multiple types of DNA damage occurred during the carcinogenesis progress of CCA.
The results mentioned above revealed the nature of genomic instability in CCA patients. We further investigated the association between TMB/CNAB and the overall survival of patients. We found that the patients carrying higher loads of mutations showed worse outcomes after adjusting for age, sex, and stage at diagnosis (Hazard ratio (HR)= 1.71, P = 3.9 × 10 -4 , Fig. 1D). The association was consistent in pCCA and iCCA patients ( Supplementary Fig. 1C). When examining CNAB, we noticed that the survival of pCCA patients with higher CNAB was signi cantly worse than for patients with lower CNAB (HR = 2.70, P = 3.6 × 10 -2 , Fig. 1E); However, there was no signi cant difference between iCCA patients with distinct CNAB (HR = 0.91, P=6.7 × 10 -1 , Fig. 1F).

New driver genes
We identi ed 36 signi cantly mutated genes (SMGs, Fig. 2), also described as mut-drivers, in all patients by MutSig2CV and IntOGen, including 24 reported SMGs and 12 new SMGs not highlighted in previous CCA publications (Supplementary Table 5). Among these new SMGs, MACF1 (5.17%, 18/348) and AXIN1 (0.86%, 3/348, all of the three patients were with iCCA), which is commonly considered as one of drivers in hepatocellular carcinoma (HCC), are two interactive components of the β-catenin/Wnt signaling pathway (Fig. 2) . Although the mutation rate of AXIN1 was low, we noticed that there were multiple nonsynonymous mutations in two iCCA patients (Fig. 2). The mutations of PIK3R1, a component of the PI3K pathway, was displayed in 2.59% (9/348) of patients. Post-transcriptional modi cation genes RBM10 (3.16%, 11/348) and METTL14 (0.86%, 3/348) act as well-known alternative splicing regulators and m 6 A writers. Importantly, we identi ed recurrent mutations (Supplementary Methods) in METTL14, and all mutations in this gene affected the same amino acid (Supplementary Fig. 2A), which were also predicted as potentially deleterious variants by several bioinformatic pathogenicity prediction tools (Supplementary Table 5). We observed the same mutation rate (1.44%, 5/348) of chromatin modi ers SMARCA4 and WHSC1. Interestingly, mutations in these genes were prone to co-occur in the same patient (Fisher's exact test OR = 660, P = 2.29 × 10 -6 , Fig. 2). In addition, classic cancer driver genes identi ed in other cancers were also identi ed as SMGs in this study, including ATM (3.45%, 12/348), BRCA2 (2.01%, 7/348), and MLLT4 (2.01%, 7/348). Although BRCA2, EPHA2 and ATM were mentioned in previously published work due to their same position recurrent inactivating mutations, our integrated analysis with more sample size rstly identi ed them as SMGs 18 .
Next, we performed pathway enrichment analysis on all driver genes identi ed in this study, which included SMGs (mut-drivers) and cancer genes in the frequently altered focal CNA regions (CNA-drivers) described above, and found that the driver genes of CCA were signi cantly enriched in the RTK-RAS, Wnt, PI3K, Cell Cycle, TP53, TGF-beta, and HIPPO pathways (Fig.3C). Including the newly identi ed MET ampli cation, 34.3% of CCA patients harbored mutations and CNAs in oncogenes from the RTK-RAS pathway, and these alterations occurred in a mutually exclusive manner, as reported in other cancers (Fig.  3D).

Functional recurrent mutations in METTL14
Among the potential driver genes mentioned above, we identi ed a new potential driver gene, METTL14, the main factor involved in aberrant m6A modi cation of various cancers, with recurrent and deleterious mutations 26,27 (Supplementary Fig. 2A, Supplementary Table 7). All three mutations affected the same 298 amino acid residue (METTL14 p.R298H and p.R298C) and two of them were in the same genomic position (two patients from xMU and ICGC respectively) (Supplementary Fig. 2A). We further conducted sanger sequencing in an independent pCCA cohort with extra 40 subjects from xMU cohort and identi ed an additional p.R298H carrier ( Supplementary Fig. 2B & C). In the COSMIC database, we also found that the same p.R298H also occurred in three pancreatic ductal carcinomas (Supplementary Fig. 2A) and all the three patients were East Asian.
Crystal structures of the METTL3-METTL14 complex have revealed that p.R298 lies close to the putative RNA-binding groove of the complex which may have a complex role to affect methylation activity 28 . The recurrent and deletions mutant p.R298H suggested us its possibly positive selection and the need of METTL14's normal action in antitumor progress. However, the relevance of this hotspot mutation and m 6 A mRNA methylation to the CCA has not yet been established. Dysfunction of METTL14, the key catalytic protein forming the core m 6 A methyltransferase complex, has shown fundamental biological effects in cancer initiation and progression 28,29 . Thus, we hypothesized that CCA could be associated with METTL14 that regulated m 6 A mRNA methylation. Hence, we rst examined METTL14 expression in 69 CCA pairs and matched adjacent normal tissues. The results of qRT-PCR and western blot revealed that expression of METTL14 was signi cantly downregulated in tumors (P < 0.01; Fig. 4A & B). The immunohistochemistry (IHC) staining on tissue microarray also con rmed that METTL14 staining was decreased in CCA at protein level ( Supplementary Fig.3A) and its downregulation displayed a marginal signi cant association with poor cancer-speci c survival in CCA (Log-Rank P = 0.08; Supplementary   Fig.3B). Consistently, we found that the m 6 A level of total RNA was signi cantly decreased in CCA tissues (Fig. 4C). Interestingly, we also observed the m 6 A modi cation level was signi cantly decreased in METTL14 low expression CCA group compared with METTL14 high expression CCA group ( Supplementary Fig. 3C). These results suggested that METTL14 and it mediated m 6 A modi cation were frequently down-regulated or disturbed in CCA.
To determine the "driver" role of METTL14 and R298H mutations during CCA development, we rst con rmed the transfection e ciency of lentiviral constructs expressing METTL14 wt and METTL14 R298H in RBE and HCCC9810 cell lines ( Fig. 4D and Supplementary Fig. 3D&E). In contrast to the cells stably overexpressing METTL14 wt , cells overexpressing METTL14 R298H showed signi cantly decreased overall m 6 A modi cation ability ( Fig. 4E and Supplementary Fig. 3F). Subsequently, while overexpression of wildtype METTL14 decreased cell proliferation, overexpression of the mutation had no noticeable effect on cell proliferation (Fig. 4F, G & H and Supplementary Fig. 3G, H&K). In addition, overexpression of METTL14 wt resulted in a decrease in cell migration and invasion, and METTL14 R298H remarkably reversed the gene's ability to inhibit migration and invasion in CCA cells (Fig. 4I & Supplementary Fig. 3I & J).
Taken together, these results provided us evidence for loss of function caused by METTL14 R298H mutation, suggesting that METTL14 R298H mutation could reduce the tumor-suppressing effect of METTL14 wt .
New driver gene MACF1 served as the target of METTL14 We then conducted RNA-Seq and MeRIP-Seq assays in negative control, METTL14 wt , and METTL14 R298H cells. A total of 2,601 peaks involving classic transcripts of 1,586 genes were robustly identi ed by exomePeak2 in all cells. The most common m 6 A motif, GGAC, was signi cantly enriched in the m 6 A peaks identi ed ( Supplementary Fig. 4A), and the m 6 A peaks were especially enriched in the vicinity y of the stop codon. Next, we performed differential m 6 A-methylation analysis between control and METTL14 wt cells, as well as between METTL14 wt and METTL14 R298H cells. Because of the writer role of METTL14 during the m 6 A methylation modi cation, we included only m 6 A peaks (712) with increased abundance in METTL14 wt cells as compared to control cells, and peaks (990) with decreased abundance in METTL14 R298H as compared to METTL14 wt cells. A total of 237 peaks were shared by both analyses, including four peaks on the two driver genes mentioned above (MACF1, MET) (Fig. 5A & B). MACF1 was also the new driver gene identi ed in this study (Supplementary Table 5). We then veri ed whether m 6 Amodi ed MACF1 was susceptible to decay. The lifetime of MACF1 was prolonged in METTL14 R298H cells and shortened in METTL14 wt cells after actinomycin D treatment (Fig. 5C). We further immunoprecipitated m 6 A from the RNAs of METTL14 wt and METTL14 R298H cells and found that METTL14 R298H signi cantly decreased the amount of MACF1 modi ed by m 6 A compared to METTL14 wt (Fig. 5D). Because MACF1 was a very large gene (involving 22Kb of exons), we applied immuno uorescence assays to further con rm that METTL14 wt mediated MACF1 degradation, and METTL14 R298H showed increase expression of MACF1 compared to METTL14 wt (Fig. 5E). We also found that METTL14 wt cells decreased the expression of MACF1, and no noticeable effect on MACF1 expression was observed in METTL14 R298H cells using qRT-PCR ( Supplementary Fig. 4B).
When we successfully transfected CCA cells with siRNA pools targeting MACF1 ( Supplementary Fig. 4D), signi cant inhibition of tumor metastasis and proliferation was observed (Fig. 5F, G, H, & I, and Supplementary Fig. 4E & F). Recent studies reported that MACF1 was involved in tumor metastasis and cytoskeleton, and that it played crucial roles in the nucleus translocation of β-catenin 30,31 . In our ndings, we rst validated MACF1 upregulation in CCA using qRT-PCR ( Supplementary Fig. 4C), and found that knockdown of MACF1 signi cantly reduced CCA cell proliferation and metastasis in vitro. Given the essential role of MACF1 in regulating the nucleus translocation of β-catenin, immuno uorescence assays showed that the increase of nuclear β-catenin was correlated with the expression of METTL14 R298H rather than METTL14 wt (Fig. 5J & K). To ascertain the role of MACF1 in METTL14-mediated nucleus translocation of β-catenin, we transfected MACF1 siRNA in METTL14 R298Hoverexpressing cells, and observed that the nucleus translocation of β-catenin was decreased compared to only METTL14 R298H -overexpressing cells (Supplementary Fig. 4G).
Additionally, METTL14 R298H resulted in reversing the expression level of E-cadherin, N-cadherin, PCNA, and CyclinD1, which were the downstream targets of β-catenin, relevant to METTL14 wt (Fig. 5K). These results implied that METTL14-mediated m 6 A modi cation repressed the MACF1/β-catenin pathway in CCA, while METTL14 R298H mutation disrupted this mechanism.

Discussion
In this study, we included genomic data from 348 CCA samples (including 87 from pCCA and 261 from iCCA) to investigate the genomic landscapes of both pCCA and iCCA. We found the shared and distinct features between these two anatomical subtypes. Currently, there is no effective biomarker that can accurately predict the prognosis of CCA patients 32,33 . Our study found that TMB has a good predictive effect on the prognosis of East-Asian CCA patients. It is worth noting that pCCA patients with higher CNAB showed worse survival outcomes compared to those with lower CNAB. These ndings suggest that we can explore prognostic markers of patients with CCA from a more macroscopic genetic perspective.
In addition to several new driver genes in classic pathways, we identi ed driver genes participating in post-transcriptional modi cation (i.e., METTL14 and RBM10), which carried more mutations in pCCA patients. With in vitro experiments, we con rmed the role of METTL14, as well as the role of this mutation in CCA development. We further determined that m 6 A modi cation level and expression of new driver gene MACF1 could be regulated by METTL14, which can in uence the proliferation and metastasis ability of CCA cells.
The previous study showed that there was lack of evidence that anatomical site determines molecular subtypes according to integration analysis of transcriptomic and methylation data 34 . Our study identi ed four new driver genes that were related to chromatin and methylation modi cation (i.e., mut-drivers: SMARCA4 and WHSC1; CNA-drivers: DNMT3A and EZH2), further supporting the nding that driver genes cause disruption at in the transcriptional level. Consistent with the previous study's ndings, we found comparable frequency between these alterations of driver genes. However, our analysis rst identi ed two genes as drivers (METTL14 and RBM10) and found that they are related to post-transcription modi cation. Mutations in these two genes were more common in pCCA patients. This evidence suggested that the investigation of the difference between pCCA and iCCA should be extended to the post-transcriptional area. METTL14 engages in m 6 A modi cation 35,36 , which modulates alternative splicing, export, stability, and translation of mRNA 37 . Interestingly, we found that two new driver genes (MACF1 and MET) could be modulated by the m 6 A alteration caused by METTL14 mutations. It was worthy of note that all of these genes had a higher mutation rate or copy number altered rate in iCCA than in pCCA. Thus, they were crucial driver genes in both iCCA and pCCA. However, they could be activated through different mechanisms. Although MACF1 and MET could serve as shared therapeutic targets for pCCA and iCCA, we should not ignore the fact that METTL14 could modulate m 6 A modi cation of broader genes other than the two drivers, which may lead to unexpected drug resistance or side effects, and therefore, further investigation in pCCA patients is warranted. Another gene, RMB10, is an RNA binding protein and alternative splicing regulator frequently mutated in lung adenocarcinomas 38 . Similar to lung cancer, the majority of RBM10 mutations (63.6%) in CCA truncate the protein and may alter downstream splicing of speci c genes 38,39 . Thus, the post-transcriptional events in CCA could be highly disturbed and further study may help determine new molecular subtypes and optimize therapeutic strategy in the clinical setting.
Our study also identi ed a number of new driver genes, involving multiple important pathways. MACF1 and AXIN1 were rst identi ed as SMGs of CCA. They are involved in a complex that also contained CTNNB1, GSK3B, and APC, and can contribute to the activation of β-catenin/Wnt signaling pathways 30,40 . Consistently, the mutations in the two genes were mutually exclusive. CNA-driver MET belongs to the classic RTK-RAS signaling pathway, and its inhibitor has been widely used in other cancers 41,42 . However, MET ampli cation has never been reported in a genomic study of pCCA and reported with rare ampli cation frequency in Caucasion iCCA 23 , suggesting that it may occur more frequently in Chinese patients. According to OncoKB, nearly 106 (30.5%) of all CCA patients carried actionable alterations in the RTK-RAS signaling pathway. PIK3R1, phosphatidylinositol 3-kinase regulatory subunit alpha, is the predominant regulatory isoform of PI3K 43 and is frequently mutated in multiple cancers 44,45 . ATM served as the activator of the TP53 tumor suppressor protein and somatic ATM mutations or deletions that are commonly found in lymphoid malignancies, pancreatic cancer, and lung adenocarcinoma 46 . Notably, our results suggested that ATM could be affected by both truncating mutations and deletion, resulting in the bi-allelic inactivation. Several known tumor suppressors, such as TP53, SMAD4, PTEN, and CDKN2A, showed a similar bi-allelic loss pattern. Germline mutations in BRCA2, which encodes a BRCA-associated protein, were reported in familial CCA cases 47,48 . Our study also suggested that BRCA2 could be affected by somatic mutations. MLLT4, also known as AF6, commonly fused with KMT2A in acute myeloid leukemia. It has also been identi ed as a SMGs in breast cancers 49 . Few genomic alterations has ever been reported in NACC1, but it is activated in ovarian serous carcinomas and in uences cell apoptosis, senescence, and cytokinesis in cancer cells 50,51 . Integrated whole-exome analysis of Asian CCAs identi ed new CCA driver genes and emphasized the similarities and distinctions genomic characteristics between pCCA and iCCA. Importantly, our study also highlighted the effect of underlying post-transcriptional modi cation in CCA occurrence and development. These results provide a better understanding of the CCA mutational landscape that may drive improvements in clinical practice.