A Novel Stop-Gain Mutation in MSH2 Gene Among a Persian Family Fulfilling Classic Amsterdam Criteria for Lynch Syndrome

Purpose: Lynch syndrome is the most common hereditary cancer syndrome and is caused by a germline mutation in one of the mismatch-repair (MMR) genes. It results in early-onset colorectal cancer (CRC) and other Lynch-associated cancers in an autosomal dominant pattern. In this article, we describe a new pathogenic variant in a Persian family with familial CRC that fulfills the Amsterdam II criteria.

Methods: IHC staining of MMR proteins (IHC-MMR) was performed on tissue sections from the tumor and the adjacent healthy tissue of the proband. Microsatellite instability (MSI) testing was also performed on DNA extracted from tumor and healthy tissue using a Promega kit. Finally, next-generation sequencing (NGS) was performed on genomic DNA of the proband using a 12-gene panel that included the MMR genes. Variant filtering and prioritization were done using bioinformatics tools. Co-segregation analysis was used to evaluate the pathogenic variant identified.


Introduction
Colorectal cancer (CRC) is one of the most common cancers worldwide, and its prevalence is rapidly increasing in developing countries such as Iran, likely due to the westernization of lifestyle [1]. CRC is now the second and fourth most common cancer among Iranian women and men, respectively.
Nevertheless, no systematic program has been set up so far for general screening and early detection of this common healthcare problem in Iran (2)(3)(4).
Lynch syndrome (LS), the most common hereditary cancer predisposition syndrome, accounts for 3-5% of CRC patients. It is an autosomal dominant disorder in which the lifetime cancer risk of carriers rises to as much as 80%, with most cancers occurring before 50 years of age [5,6]. Although CRC and endometrial cancer make up most LS-associated cancers, other organs such as the stomach, small bowel, upper uroepithelial and hepatobiliary tracts, breast, ovary, brain, and even skin can be the primary site in carriers [7][8][9]. The molecular basis of LS is a germline mutation in at least one of the four DNA mismatch repair (MMR) genes: MLH1, MSH2, PMS2, and MSH6. Moreover, germline deletion of the 3′ end of EPCAM, a gene located immediately upstream of MSH2, has recently been described as a cause of epigenetic silencing of MSH2 leading to LS [10,11]. MMR deficiency causes a characteristic molecular phenotype in which DNA microsatellites, short tandem repeats (STRs) with repeat units of 1-6 nucleotides, show instability. This phenotype, called microsatellite instability (MSI), is routinely used as a surrogate marker for MMR deficiency and is assessed by a genetic test known as MSI testing [12,13]. MMR deficiency can also be evaluated by immunohistochemical (IHC) staining of tumor sections for MMR proteins [14]. LS is confirmed if at least one germline pathogenic variant is found in the MMR genes. Otherwise, Lynch-like syndrome is suggested for a patient with an MMR-deficient tumor and no identified germline mutation in the MMR genes [10,15].
Although a few studies have addressed the epidemiological, clinicopathological, and molecular aspects of CRC in Iranian populations, these limited data suggest that the molecular features of the disease in Iranian ethnic groups differ from those in populations of other countries [4,16]. In this article, we report a new pathogenic variant of the MSH2 gene found in a Persian family with LS.

Subjects and clinical samples
The subjects of this study were four members of a family fulfilling the Amsterdam II criteria [17]. These criteria are commonly used to screen at-risk families for LS and require the following: three or more affected members with histologically confirmed CRC or another LS-associated cancer, such as cancer of the endometrium, small intestine, stomach, urinary tract, skin, brain, or breast, one of whom is a first-degree relative of the other two; exclusion of familial adenomatous polyposis; involvement of two or more successive generations; and at least one affected member diagnosed before 50 years of age. In our study, there were at least five cancer-affected members in three successive generations of the family, three of whom were affected by CRC (Fig. 1). The index case was a 40-year-old woman (38 years at diagnosis) whose tumor was located in the sigmoid colon.
Formalin-fixed paraffin-embedded (FFPE) tissue was used for IHC staining, and genomic DNA extracted from it was used for MSI testing. Genomic DNA extracted from peripheral blood was used for targeted next-generation sequencing (NGS).

The raw sequencing reads obtained from NGS were filtered to remove low-quality reads with more than 10% uncalled bases, N bases appended at the ends of reads, and every chimeric read with more than 15 bases matching the primer sequences. The qualified reads were then mapped to the human genome (hg19) with BWA v0.7.12 [19]. Single nucleotide variants (SNVs) and short insertions/deletions (indels) were called using the Genome Analysis Toolkit (GATK v3.4-46) [20]. SNVs and indels were filtered according to the GATK best-practice pipeline. To confirm that all of the intended genes had been captured by NGS, the BAM files were inspected in the Integrative Genomics Viewer (IGV 2.6x). Functional annotation of the variants was performed with ANNOVAR and snpEff [21,22]. All variants were filtered against a minor allele frequency (MAF) of 1% to rule out potential polymorphisms, based on annotations from the 1000 Genomes [23], Genome Aggregation Database (gnomAD) [24], and Exome Aggregation Consortium (ExAC) [25] population databases. Homozygous variants were also excluded because Lynch syndrome follows a dominant inheritance pattern. Since loss of function (LOF) of the causative genes is the main mechanism of hereditary cancers such as LS [26], our analysis started with variants predicted to produce null proteins (frameshift, stop codon, initiation codon, and critical splice-site variants). Splice-site variants were evaluated with Human Splicing Finder [27].
Further investigation, if needed, proceeded with missense variants, using computational tools including MutationTaster [28], MutationAssessor [29], SIFT [30], and PolyPhen-2 [31] to estimate their probability of being damaging. Finally, the pathogenicity of the identified variants was interpreted according to the ACMG 2015 standards and guidelines for the interpretation of sequence variants [32].
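The filtering and prioritization logic described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the `Variant` record, its field names, and the effect labels are hypothetical stand-ins for annotated output, and only the three rules stated in the text are applied (MAF below 1%, heterozygous calls only, null variants examined first).

```python
from dataclasses import dataclass

# Hypothetical record layout; the field names are illustrative and are
# not the actual ANNOVAR or snpEff output columns.
@dataclass
class Variant:
    gene: str
    effect: str      # e.g. "stop_gained", "frameshift", "missense"
    zygosity: str    # "het" or "hom"
    max_maf: float   # highest frequency across population databases (0.0 if absent)

# Effect classes treated as null (loss-of-function) in this sketch.
NULL_EFFECTS = {"stop_gained", "frameshift", "start_lost", "splice_site"}

def prioritize(variants, maf_cutoff=0.01):
    """Apply the filters described in the text and rank null variants first."""
    kept = [
        v for v in variants
        if v.max_maf < maf_cutoff    # drop likely polymorphisms (MAF of 1% or more)
        and v.zygosity == "het"      # dominant inheritance: drop homozygous calls
    ]
    # sorted() is stable, so variants keep their input order within each group.
    return sorted(kept, key=lambda v: v.effect not in NULL_EFFECTS)

if __name__ == "__main__":
    demo = [
        Variant("GENE_A", "synonymous", "het", 0.20),
        Variant("GENE_B", "missense", "het", 0.0005),
        Variant("GENE_C", "missense", "hom", 0.0),
        Variant("MSH2", "stop_gained", "het", 0.0),
    ]
    for v in prioritize(demo):
        print(v.gene, v.effect)
```

In this toy input, the common synonymous variant and the homozygous call are discarded, and the heterozygous stop-gain variant is listed ahead of the rare missense variant.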

Co-segregation Analysis by PCR-Sanger Sequencing
In addition to the proband, three of her first-degree relatives were investigated for the presence of the identified pathogenic variant: her father, who was affected by CRC, and her two healthy sisters. Co-segregation analysis was carried out by PCR-Sanger sequencing of genomic DNA extracted from the peripheral blood of these individuals. The forward (GCATGAAGTCCAGCTAATACAG) and reverse (GCTATTAAAGTGTCTCAAACCA) primers were designed to amplify the DNA fragment containing the identified variant. The PCR program consisted of 95°C for 5 min (1 cycle); 30 cycles of 94°C for 30 s, 58°C for 30 s, and 72°C for 20 s; and a final extension at 72°C for 1 min. The PCR products were run on a 2% agarose gel to confirm the expected amplicon size of 357 bp. Sanger sequencing was performed on an Applied Biosystems 3130xl Genetic Analyzer (16-capillary electrophoresis).
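As a quick check on the protocol above, the short Python sketch below adds up the programmed hold times of the cycling program and reports the length and GC content of each primer. The temperatures, durations, cycle number, and primer sequences are taken from the text; the function names are illustrative, and ramping time between temperatures is ignored.

```python
# Cycling program as stated in the text: (temperature in °C, duration in seconds).
INITIAL_DENATURATION = (95, 5 * 60)
CYCLE_STEPS = [(94, 30), (58, 30), (72, 20)]
N_CYCLES = 30
FINAL_EXTENSION = (72, 60)

FORWARD = "GCATGAAGTCCAGCTAATACAG"
REVERSE = "GCTATTAAAGTGTCTCAAACCA"

def total_hold_seconds():
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + N_CYCLES * per_cycle + FINAL_EXTENSION[1]

def gc_percent(primer):
    """Percentage of G and C bases in a primer sequence."""
    primer = primer.upper()
    return 100.0 * sum(primer.count(base) for base in "GC") / len(primer)

if __name__ == "__main__":
    seconds = total_hold_seconds()
    print(f"Programmed hold time: {seconds} s ({seconds / 60:.0f} min)")
    for name, seq in (("forward", FORWARD), ("reverse", REVERSE)):
        print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f}%")
```

The programmed hold time comes to 300 + 30 × 80 + 60 = 2760 s, or 46 min, excluding ramping.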

Results
The studied family included at least five confirmed cancer patients within three successive generations: three with colorectal cancer, one with prostate cancer, and one with breast cancer (Fig. 1). The proband (III-3) was a woman diagnosed at 38 years of age with CRC located in the sigmoid colon. The pathologic diagnosis was reported as "well-differentiated adenocarcinoma", stage B according to the modified Astler-Coller classification, about three months after the onset of clinical symptoms in Nov 2010, which included chronic constipation and hematochezia. The proband's father (II-4) had also been diagnosed with CRC of the rectosigmoid colon at 66 years of age, two years before his daughter.
MSI testing and IHC staining of MMR proteins on colorectal tumor and normal tissue of the proband showed an MSI-H phenotype and absent IHC staining of both MSH2 and MSH6 proteins. In total, 16,482 variants were called and analyzed according to the method described above. Altogether, one null variant and six synonymous and missense variants were identified (Fig. 2). Finally, a heterozygous, clinically significant variant was identified in the MSH2 gene, in which an A-to-T substitution results in a stop codon in the first exon of the gene transcript (MSH2 (NM_000251.2): c.364A>T, rs374127044). The variant was found neither in any population database nor in disease databases such as ClinVar and OMIM.
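The relationship between a coding position and its codon can be made explicit with a small Python sketch. It assumes only that c.1 is the first base of the start codon, as in standard coding-sequence numbering; the function names are illustrative. The sketch does not look up the MSH2 reference sequence, so it lists the codons that an A>T change at that codon position could convert into a stop codon rather than asserting which one is present in NM_000251.2.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}

def parse_cdna_substitution(hgvs_c):
    """Parse a simple coding substitution such as 'c.364A>T' into (pos, ref, alt)."""
    match = re.fullmatch(r"c\.(\d+)([ACGT])>([ACGT])", hgvs_c.replace(" ", ""))
    if match is None:
        raise ValueError(f"unsupported HGVS string: {hgvs_c!r}")
    return int(match.group(1)), match.group(2), match.group(3)

def codon_coordinates(cdna_pos):
    """Return (codon number, position within codon), both 1-based."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1

def reference_codons_giving_stop(ref, alt, pos_in_codon):
    """List every sense codon that the substitution would turn into a stop codon."""
    hits = []
    for stop in sorted(STOP_CODONS):
        if stop[pos_in_codon - 1] == alt:
            original = stop[:pos_in_codon - 1] + ref + stop[pos_in_codon:]
            if original not in STOP_CODONS:
                hits.append(original)
    return hits

if __name__ == "__main__":
    pos, ref, alt = parse_cdna_substitution("c.364A>T")
    codon, offset = codon_coordinates(pos)
    print(f"c.{pos} is base {offset} of codon {codon}")
    print("Candidate reference codons:", reference_codons_giving_stop(ref, alt, offset))
```

For c.364A>T, position 364 is the first base of codon 122, and the candidate reference codons are AAA, AAG (both lysine), and AGA (arginine).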
In the co-segregation analysis, two affected and two unaffected members of the pedigree were assessed for the c.364A>T variant in the MSH2 gene by PCR-Sanger sequencing. The variant was found in both affected members (II-4 and III-3) and in one of the proband's two healthy sisters (III-4) (Fig. 3).

Discussion
In the present study, a familial cancer pedigree of Persian origin was ascertained. Three of the five cancer patients were affected by CRC; the other two had breast or prostate cancer. We were able to identify the molecular etiology by taking advantage of NGS technology.
Based on a literature review of clinicopathologic studies, MSH2 germline mutations are associated with more extracolonic malignancies than mutations in other MMR genes [33,34]. Although breast and prostate cancers have not been considered Lynch-associated cancers in some studies, several lines of evidence suggest that MMR deficiency could increase the incidence of these cancers in carriers (24)(25)(26)(27). In a recent case series, we found breast and prostate cancers to be the first and third most frequent extracolonic cancers among Iranian MMR-deficient families [39].
According to the IHC-MMR results on the FFPE tumor tissue of the proband, both MSH2 and MSH6 proteins appeared to be defective. We therefore expected analysis of the NGS-identified variants to reveal a pathogenic variant in the MSH2 gene. MSH2 is one of the main MMR proteins, and MSH6 is its accessory protein. According to the Human Gene Mutation Database [40], a well-established database of human genes and mutations, 1008 pathogenic variants had been registered for the MSH2 gene by Jun 30, 2019, of which 32.9% were missense/nonsense variants (Fig. 4) (35). The identified heterozygous variant (MSH2 (NM_000251.2): c.364A>T, rs374127044) is located in the first exon of the MSH2 gene and results in a stop codon. It is the only identified pathogenic variant that is compatible with the IHC-MMR findings. Given the type of the variant (null allele) and the simultaneous defect of both MSH2 and MSH6 proteins on IHC staining, this variant is most likely responsible for the MMR-deficient phenotype in the family. Furthermore, according to the ACMG 2015 standards and guidelines for the interpretation of sequence variants [32], the identified variant meets the criteria for classification as pathogenic for Lynch syndrome. The variant is a nonsense variant that could result in loss of function of the MSH2 gene product, which matches the mechanism of LS and constitutes very strong evidence of pathogenicity (PVS1) [26]. Moreover, the variant has not been found in any population database, including 1000 Genomes, ExAC, and gnomAD, which provides moderate evidence of pathogenicity. The variant was also found in both affected CRC patients in the family, while one of the proband's healthy first-degree relatives did not carry it in the co-segregation analysis (Fig. 3).
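The way such evidence combines under the ACMG 2015 framework can be sketched in Python. The function below encodes only the guideline's combining rules for the pathogenic and likely pathogenic classes and ignores benign evidence, so it is a simplification rather than a full classifier. Counting the co-segregation finding as one supporting criterion is an assumption made for illustration; the text itself names only the very strong (PVS1) and moderate evidence levels.

```python
def classify(very_strong=0, strong=0, moderate=0, supporting=0):
    """Combine counts of pathogenic evidence following the ACMG 2015 rules.

    Benign evidence is not modelled, so this is a simplified sketch.
    """
    pvs, ps, pm, pp = very_strong, strong, moderate, supporting

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    if pathogenic:
        return "Pathogenic"

    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    if likely_pathogenic:
        return "Likely pathogenic"

    return "Uncertain significance"

if __name__ == "__main__":
    # Null variant (very strong) plus absence from population databases (moderate).
    print(classify(very_strong=1, moderate=1))
    # Same evidence plus co-segregation counted as supporting (an assumption).
    print(classify(very_strong=1, moderate=1, supporting=1))
```

With one very strong and one moderate criterion, the sketch returns "Likely pathogenic"; adding one supporting criterion moves the result to "Pathogenic".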

Declarations
Funding: This article is derived from a PhD project supported by Shahrekord University of Medical Sciences.
Conflicts of interest/Competing interests: The authors declare no conflicts of interest or competing interests.
Availability of data and material: All data related to the results are available in the data repository of the PI.
Code availability: Not applicable
Ethics approval: This study was ethically approved by the research deputy of Shahrekord University of Medical Sciences.

Figure 1
The pedigree of a Persian family with familial cancer in three successive generations.

Figure 3
The electropherogram of the DNA sequence illustrating the c.364A>T variant of the MSH2 gene in two affected members (II-4 and III-3) and one healthy member (III-4) of the pedigree.