Whole-exome sequencing detection of the somatic mutations associated with the tumorigenesis and gefitinib-response of mucoepidermoid carcinomas


Background: We performed whole-exome sequencing (WES) on sputum and blood samples from a patient with mucoepidermoid carcinoma (MEC) to explore the genetic alterations underlying the mechanisms of MEC and gefitinib response.
Methods: We previously reported a 10-year-old MEC patient who was cured after a complete response to gefitinib treatment. WES was performed on samples from this patient to detect somatic mutations. Genes harboring somatic mutations were compared with previously reported mutant genes related to MEC.
Results: Somatic mutations were detected in 13 previously reported oncogenes and tumor suppressors, and the mutated genes were enriched in apoptosis (RIPK1, SPTA1, and ACTG1). Loss or gain of phosphorylatable amino acids occurred in 8 of the 34 non-synonymous mutations, which resided in ARL6, DNAH11, PGM5, PRAMEF15, RALGAPB, RANBP2, TTN, and UBN1. TTN bore two Ala-to-Thr mutations. Among the 50 genes containing detected somatic mutations, ADAM28, DYSF, GP2, PPP2R5B, and TTN were also detected in a previous study, and all of these overlaps were identified in low- and intermediate-grade samples.
Conclusions: These findings raise the possibility that accumulated somatic mutations in tumor suppressor genes and oncogenes contributed to tumorigenesis in our MEC patient, which has potential applications for MEC therapy.

Mapping and variant analysis: Adaptors were first removed from raw reads using cutadapt (version 1.7.1). Reads were then processed with the FASTX Toolkit (version 0.0.14) to trim low-quality bases (qualities < 20) and to remove low-quality reads (< 70% of read length with qualities < 20). N-containing reads were then trimmed from the N base. High-quality reads longer than 16 nt were aligned to the human genome (GRCh38) using BWA-MEM v 0.7.10-r789 (

Somatic mutation detection
Somatic mutations were identified in matched tumor and normal samples using GATK Mutect2. A mutation in the tumor was considered a candidate somatic mutation only when (i) distinct paired reads contained the mutation in the tumor; (ii) the number of distinct paired reads containing the mutation in the tumor was at least 10% of the read pairs for the exome; (iii) the base depth at the mutation was at least 10; and (iv) the position was covered in both the tumor and the normal sample. Mutations arising from misplaced genome alignments, including paralogous sequences, were identified and excluded by searching the reference genome. Candidate somatic mutations were further filtered by gene annotation to retain those occurring in protein-coding regions.
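The four read-support criteria above can be sketched as a simple post-calling filter. This is only an illustration, not the authors' pipeline: the `Candidate` record, its field names, and the reading of criterion (i) as "at least one supporting read pair" and of criterion (iv) as "at least one read pair in the normal sample" are assumptions made for the example.

```python
from dataclasses import dataclass

# Thresholds taken from the criteria in the text.
MIN_ALT_FRACTION = 0.10  # criterion (ii): >= 10% of read pairs carry the mutation
MIN_DEPTH = 10           # criterion (iii): depth at the mutated base >= 10
MIN_ALT_PAIRS = 1        # criterion (i): ASSUMED to mean at least one distinct pair


@dataclass
class Candidate:
    """Hypothetical per-site summary; field names are illustrative only."""
    gene: str
    tumor_alt_pairs: int  # distinct read pairs supporting the mutation (tumor)
    tumor_depth: int      # read pairs covering the position (tumor)
    normal_depth: int     # read pairs covering the position (normal)


def passes_filters(c: Candidate) -> bool:
    """Return True if the site satisfies criteria (i) to (iv)."""
    if c.tumor_alt_pairs < MIN_ALT_PAIRS:                    # (i)
        return False
    if c.tumor_depth < MIN_DEPTH:                            # (iii)
        return False
    if c.tumor_alt_pairs / c.tumor_depth < MIN_ALT_FRACTION: # (ii)
        return False
    if c.normal_depth < 1:                                   # (iv), assumed cutoff
        return False
    return True


# Toy sites with made-up counts, not data from the study.
sites = [
    Candidate("GENE_A", tumor_alt_pairs=5, tumor_depth=40, normal_depth=30),
    Candidate("GENE_B", tumor_alt_pairs=1, tumor_depth=40, normal_depth=30),
    Candidate("GENE_C", tumor_alt_pairs=3, tumor_depth=8, normal_depth=30),
    Candidate("GENE_D", tumor_alt_pairs=5, tumor_depth=40, normal_depth=0),
]
kept = [s.gene for s in sites if passes_filters(s)]
```

In this toy input only `GENE_A` survives: `GENE_B` fails the 10% fraction, `GENE_C` the depth cutoff, and `GENE_D` has no coverage in the normal sample.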

Copy Number Analysis
Copy Number Analysis from the exome data was performed using the program Control-FREEC (Boeva, Popova et al. 2012).
Functional enrichment analysis: To assign genes harboring somatic mutations to functional categories, Gene Ontology (GO) terms and KEGG pathways were identified using the KOBAS 2.0 server (Xie, Mao et al. 2011). The hypergeometric test and the Benjamini-Hochberg FDR-controlling procedure were used to define the enrichment of each term. Oncogenes and tumor suppressors for different cancer types were taken from the CancerMine database (http://bionlp.bcgsc.ca/cancermine/). All statistical analyses used two-sided tests with the significance level set at a P value of 0.05.
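For readers unfamiliar with the two statistics named above, the following is a minimal standard-library sketch of an upper-tail hypergeometric enrichment p-value and the Benjamini-Hochberg adjustment. It is not the KOBAS implementation; the function names and the toy numbers are assumptions made for illustration.

```python
from math import comb


def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k) for X ~ Hypergeometric.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    """
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1))
    return tail / total


def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy example (made-up numbers): 2 genes drawn from a background of 10,
# of which 5 carry the annotation; probability that at least 1 is annotated.
p_toy = hypergeom_sf(k=1, N=10, K=5, n=2)
q_toy = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A term would then be called enriched when its adjusted value falls below the chosen threshold (0.05 in the text).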

Results
WES analysis of sequence nucleotide polymorphism and variations in a low-grade MEC patient completely responding to gefitinib
To study the somatic mutations related to the tumorigenesis and gefitinib response of MEC, we obtained whole-exome sequencing data from fresh-frozen sputum and whole blood samples from a low-grade MEC patient (Li, Zhang et al. 2017). This 10-year-old patient was admitted to our hospital in 2012 and responded successfully to gefitinib treatment. No relapse has been observed to date.
A total of 46,979,124 and 47,415,894 sequencing reads were generated for the sputum and blood samples, respectively. Over 97% of the targeted exon regions were covered in both samples, and 74% and 76% of targeted bases showed > 10-fold coverage in the sputum and whole blood samples, respectively (Table S1).
Non-synonymous somatic mutations occur in tumor suppressors with a higher frequency than in oncogenes
Using the criteria described in the Methods section, we identified 53 candidate somatic mutations in 50 genes in the sputum of the patient (Table S2). The mutations comprised 34 non-synonymous SNVs, 10 synonymous SNVs, 3 frameshift insertions/deletions, 3 essential splice-site mutations, and 1 stop-gain, as well as 2 variations without annotation and 2 noncoding RNA mutations. Two mutations were identified in each of the following three genes: TTN (p.A4284T, p.A3465T), PGM5 (p.G215S, p.I227V), and DEAF1 (p.Y300Y, p.P299L). Five of these six mutations were non-synonymous. A total of 12 of the 50 somatic mutation-containing genes were annotated as oncogenes or tumor-suppressor genes in the CancerMine database, a significant enrichment relative to all genes (p value, 1.11e-5, hypergeometric test). Non-synonymous mutations were found in two oncogenes, ADAM28 and TTN, and in five tumor-suppressor genes, ARRDC3, MRPL48, NPAS3, EPB41L3, and RANBP2, showing a much higher frequency in tumor-suppressor genes (Table 1).
Genes harboring candidate somatic mutations were subjected to functional clustering to analyze the biological roles that these mutations preferentially affect. Among the GO biological process terms, these genes were enriched only in DNA-dependent transcription, and this enrichment did not reach significance (p-value greater than 0.05).
Non-synonymous somatic mutations were enriched in the loss and gain of Thr, Ser or Tyr
Gefitinib is an ATP analogue and well-known for its effective inhibition of the constitutively active  (Table 2). This significantly high frequency of the loss (p-value, 0.023, probability test) and gain (p-value, 0.089, probability test) of the three phosphorylatable amino acids among the non-synonymous somatic mutations in this patient might partly explain his complete response to gefitinib.
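As an illustration of what "loss" and "gain" of a phosphorylatable residue mean at the level of a single protein change, the sketch below classifies one-letter protein substitutions such as p.A4284T. It is a hypothetical helper, not the authors' procedure; in particular, treating a substitution between two phosphorylatable residues (for example Ser to Thr) as "neither" is an assumption of this example.

```python
import re

# Residues that can be phosphorylated: Ser, Thr, Tyr.
PHOSPHO = {"S", "T", "Y"}

# One-letter substitution notation such as "p.A4284T".
_SUBSTITUTION = re.compile(r"^p\.([A-Z])(\d+)([A-Z])$")


def classify(change: str) -> str:
    """Label a substitution as 'gain', 'loss', 'neither', or 'synonymous'."""
    match = _SUBSTITUTION.match(change)
    if match is None:
        raise ValueError(f"unsupported protein change: {change!r}")
    ref, _position, alt = match.groups()
    if ref == alt:
        return "synonymous"
    ref_phospho = ref in PHOSPHO
    alt_phospho = alt in PHOSPHO
    if alt_phospho and not ref_phospho:
        return "gain"
    if ref_phospho and not alt_phospho:
        return "loss"
    return "neither"  # assumed label when phosphorylatable status is unchanged


# The six protein changes reported for TTN, PGM5, and DEAF1 in the text.
changes = ["p.A4284T", "p.A3465T", "p.G215S", "p.I227V", "p.Y300Y", "p.P299L"]
labels = {c: classify(c) for c in changes}
```

Under these rules the two TTN changes (Ala to Thr) and PGM5 p.G215S are labelled as gains, while PGM5 p.I227V and DEAF1 p.P299L are neither and DEAF1 p.Y300Y is synonymous.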
Comparison of genes containing somatic mutations identified from this study and those from a previous study
We next compared the somatic mutations identified in this study with those identified in 18 pairs of
Another somatic mutation-containing gene, S100A16, encodes a member of the S100 protein family, a ubiquitously expressed calcium-binding protein of the EF-hand superfamily that is up-regulated in tumors (Marenholz, Heizmann et al. 2004, Sturchler, Cox et al. 2006). S100A16 is a prognostic marker in multiple human cancers, including lung adenocarcinoma, breast cancer, prostate cancer, colorectal cancer, and oral squamous cell carcinoma (Zhou, Pan et al. 2014, Saito, Kobayashi et al. 2015, Sapkota, Bruland et al. 2015, Zhu, Xue et al. 2016, Sun, Wang et al. 2018).
The finding that gain and loss of the phosphorylatable amino acids Thr, Ser and Tyr are strongly enriched among the missense somatic mutations identified in our patient is particularly interesting.
Gefitinib is an ATP analogue which effectively inhibits the constitutively active kinase activity of the EGFR mutants (Lynch, Bell et al. 2004

Availability of data and materials
All data generated or analyzed during this study are included in this published article and its supplementary information files.

Competing interests
The authors declare that they have no competing interests.

Funding
This work is also supported by ABLife (ABL2014-06012 to Yi Zhang). The funders were not involved in the study design, data collection, analysis, or manuscript writing nor in the decision to submit the manuscript for publication.
