Looking beyond the cytogenetics in haematological malignancies: decoding the role of tandem repeats in DNA repair genes

In cancer research, one of the most significant findings was the characterization of DNA repair deficiency as carcinogenic. Among the various repair mechanisms, the mismatch repair (MMR) and direct reversal of DNA damage systems act as multilevel safeguards of the human genome. Defects in these systems elevate the mutation rate and can result in dire consequences such as cancer. Of the several molecular signatures in the human genome, tandem repeats (TRs) appear at various frequencies in the exonic, intronic, and regulatory regions of the DNA. Hypervariability among these repeats in the coding and non-coding regions of genes is well characterized for solid tumours, but its significance in haematologic malignancies remains to be explored. The purpose of our study was to elucidate the role of nucleotide repeat instability in the coding and non-coding regions of 10 different repair genes in myeloid and lymphoid cell lines compared with control samples. To decipher the hypermutability of tandem repeats, we selected the extensively studied MMR-deficient, microsatellite-instable colorectal cancer cell line HCT116 and the MMR-proficient breast cancer cell line MCF-7, along with underemphasized haematologic cancer cell lines. A statistically significant TR variation was observed for the MSH2 and MSH6 genes in 4 and 3 of the 6 cell lines respectively. KG1 (AML) and Daudi (Burkitt’s lymphoma) were found to have compromised DNA repair competency with highly unstable nucleotide repeats. Taken together, the results suggest that mutable TRs in intronic and non-intronic regions of repair genes in blood cancer might have a tumorigenic role. Since this is a pilot study on cell lines, high-throughput research in large cohorts can be undertaken to reveal novel diagnostic markers for unexplained blood cancer patients with normal karyotypes or with karyotypic defects.


Introduction
The genetic characterization of haematologic malignancies, for classification, diagnosis, prognosis, and therapeutic intervention, is constantly evolving as high-throughput data reveal increasing genomic variation. Numerical and structural chromosomal instabilities (CINs), resulting in a 'chaotic genome', are established hallmarks of blood cancer. However, the role of tandem repeat (TR) variations, and of the molecular changes they imply, in haematologic cancers is still debated [1,2]. DNA repeats are generally grouped into two major classes: the tandem repeats (satellite DNA) and the interspersed repeats. While the mono-, di-, tri-, tetra-, penta-, and hexa-nucleotide repeats are clustered, the long and short interspersed retrotransposable elements (LINEs and SINEs) are scattered all over the genome. The tandem repeats are further classified into macrosatellites and microsatellites. The microsatellites span the genome in clusters and are more prevalent in non-coding than in coding regions of the DNA [3]. Besides their evolutionary relevance, repetitive sequences have gained importance in cancer research because they display hypervariability in repeat number [4,5]. While mutations in coding repeats result in abnormal gene products, repeat instabilities in the dark DNA can cause genomic fragility, yielding altered reading frames or splice-site variants [6]. The harm caused by DNA repeat variations can be explained by two opposing mechanisms: (1) any variation in repeat number in genes having TRs at critical positions may directly or indirectly cause functional loss.

The repeat number variations and the corresponding loss of function may trigger genomic instability, which is critical for tumour evolution; (2) upon inactivation of repair mechanisms, increased heterogeneity is observed at simple repetitive DNA (e.g., mono- and di-nucleotides) in bacteria, yeast, and mammals. The association between defective mismatch repair (MMR) and elevated instability of DNA repeats is particularly strong in solid tumours, but less prominent in blood cancers [7].
Another single-step repair system, direct reversal of DNA damage (DRD), acts specifically on pyrimidine dimers and alkylated guanine residues. O6-methylguanine (O6-meG) methyltransferase (MGMT) is a suicide enzyme that removes methyl adducts from the O6 position of guanine and inactivates itself in the process. Mutations resulting in complete or partial loss of functional MGMT are evident in lung, gastric, breast, and hepatobiliary cancers [10]. Loss of MGMT function is usually attributed to promoter hypermethylation or to nucleotide transitions arising from mispairing between O6-meG and thymine (T), a mispair that is also recognized by the MMR machinery [11]. However, it would be interesting to explore hypervariability in the repeat regions of the MGMT gene. Evidently, malfunctioning DNA repair systems pose a threat to the normal genome, which gives us an opportunity to reflect on the effects of tandem repeat contractions or expansions in various genes.
In cancer, repeat sequence variations have emerged as significant for disease onset, progression, outcome, therapeutic response, and overall survival. Microsatellite instability was first identified in hereditary nonpolyposis colorectal cancer (HNPCC) and later, in association with functionally impaired MMR genes, in tumors of the ovary, stomach, breast, skin, brain, endometrium, small intestine, urinary tract, and hepatobiliary tract [12]. Nevertheless, the role of repeat variations in haematologic cancers, which vary in molecular characteristics and clinical behaviour, remains debatable. Classically, blood cancers follow the pattern of chromosomal instability, which triggers the early events of tumorigenesis and confers accumulation of mutations, an aggressive phenotype, and therapeutic resistance during clonal evolution of the cancer. In due course, with compromised DNA repair components, the tumours shift to molecular events including mutability of tandem repeats and loss of heterozygosity. Here, however, we propose an alternative hypothesis: in certain complex cases of haematologic malignancies, CINs alone may not be the prominent cause. Early events of repeat polymorphism in the intronic and exonic regions of DNA may impair repair mechanisms, causing haematopoietic disturbances followed by cancer development. A few research groups have confirmed the presence of tandem repeat variation in myeloid and lymphoid cancers, while others have found it to be rare or absent in blood cancers [13][14][15][16]. These incongruencies motivated us to design a pilot study to investigate tandem repeat instability in MMR and MGMT gene structures in blood cancer cell lines.

Procurement of cell lines and control samples
To study the repeat number variations, four blood cancer cell lines were selected: K562 of CML (chronic myeloid leukaemia), KG-1 of AML (acute myeloid leukaemia), Molt-4 of T-ALL (T-cell acute lymphoblastic leukaemia), and Daudi of Burkitt's lymphoma. Of these, K562, Daudi, and KG-1 are profiled as MMR proficient, while Molt-4 is known to carry a mutation in one of the MutSα (MSH2/MSH6) or MutLα (MLH1/PMS2) components. The MMR-deficient HCT116 colorectal cancer (CRC) and MMR-proficient MCF-7 breast cancer cell lines were also used in the study. Five control DNA samples were used along with the above cancer cell lines. The cytogenetic and mismatch repair status (proficient/deficient) of the cell lines is provided in Supplementary Table 1. All cell lines were procured from the National Centre for Cell Sciences, Pune, India, except the CRC cell line, which was kindly provided by a collaborating laboratory. The procured cell lines were maintained under appropriate culture conditions until 80% confluency was achieved. The passage numbers, growth media, and growing conditions of the cell lines are listed in Supplementary Table 2. Blood was collected as control samples from 5 healthy individuals with informed consent.

Extraction and quantification of DNA
The DNA from six cell lines and five control samples (2 males and 3 females) was extracted using the HiPura Mammalian Genomic DNA extraction kit (MB-506) with slight deviation from the manufacturer's protocol. The isolated DNA was run on a 0.8% agarose gel in Tris-borate-EDTA buffer and quantified using a NanoDrop spectrophotometer (Thermo Fisher Scientific).

Designing of primer for gene-sequence specific tandem repeats
Of the 10 markers, 8 primer pairs were designed while 2 primer sequences were obtained from a literature search. The markers of MSH2 (BAT26, a Bethesda panel marker) and MLH1 (D3S1611) were selected from peer-reviewed literature, though these have either not been explored or been explored with inconsistent outcomes in haematologic cancers [17,18]. For the other genes (MSH3, MSH4, MSH5, MSH6, MLH3, PMS1, PMS2, and MGMT), repeat clusters were identified from the gene sequence (NCBI) and primers were designed for them using Primer3Plus (http://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi). As anticipated, the repeats for the MSH3, MSH4, MSH5, MLH1, PMS1, and MGMT genes were located in intronic regions. The markers for MSH2, MSH6, and MLH3 were located at exon-intron junctions, while PMS2 was an exonic marker. The primer pairs were finalized after validation with NBLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch) and in silico PCR tools (https://genome.ucsc.edu/cgi-bin/hgPcr). The MSH2 and MSH5 markers were enriched with mononucleotide (A)n repeats, and those of MSH3, MSH4, MLH1, MLH3, PMS1, and MGMT were dinucleotide (CA)n/(AT)n repeat markers. These repeat sequences are categorized as perfect microsatellites. While the marker of PMS2 [C(A)6C(A)4C(A)4C(A)4C(A)13] was an imperfect microsatellite repeat, the marker of MSH6 [(A)6C(A)6C(T)18] was a complex microsatellite repeat (Supplementary Table 3). Untagged primers were initially obtained for pilot runs, followed by fluorescently tagged (HEX) primers, from Eurofins India Pvt. Ltd. Detailed primer information is given in Table 1.
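The notion of a perfect microsatellite used above can be illustrated with a short scan for uninterrupted repeats of a fixed unit length. This is only a minimal sketch: the function name `find_perfect_repeats`, the copy-number thresholds, and the toy sequence are assumptions made for illustration, and it is not the procedure by which repeat clusters were identified in this study.

```python
import re

def find_perfect_repeats(seq, unit_len, min_copies):
    """Return (start, unit, copies) for each uninterrupted tandem repeat.

    start is a 0-based index; unit_len is the repeat unit size
    (1 = mononucleotide, 2 = dinucleotide); min_copies is the smallest
    number of consecutive copies that is reported.
    """
    # A unit of unit_len bases followed by at least (min_copies - 1) exact copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        # Skip units that are themselves repeats of a shorter unit,
        # e.g. "AA" is a mononucleotide run, not a dinucleotide repeat.
        if unit in (unit + unit)[1:-1]:
            continue
        hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits

# Toy sequence (hypothetical, not a real marker): an (A)26 tract and a (CA)8 tract.
toy = "GGT" + "A" * 26 + "CCG" + "CA" * 8 + "TTG"
print(find_perfect_repeats(toy, 1, 6))
print(find_perfect_repeats(toy, 2, 5))
```

Imperfect and complex repeats such as the PMS2 and MSH6 markers contain interrupting bases, so a scan of this kind would report them as several shorter perfect tracts rather than as one repeat.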

Polymerase chain reaction (PCR)
Amplification was carried out using the TaKaRa Clontech R050A PrimeSTAR GXL DNA polymerase kit. Each experiment used 30 µl of reaction mixture [12 µl 5X GXL Buffer, 2.5 mM dNTP, 1.25 U/µl PrimeSTAR GXL DNA Polymerase, and 10 pM each of forward and reverse primer along with 50 ng DNA] in three replicates, and the experiments were repeated thrice. The samples were amplified on an Eppendorf Mastercycler Nexus for 35 cycles each. Initial denaturation was at 96 °C for 6 min and final extension at 72 °C for 7 min, while annealing temperatures were adjusted as listed in Table 1. Amplified products were analysed by agarose gel electrophoresis and then subjected to fragment analysis.

Capillary electrophoresis-based fragment analysis
The PCR products from triplicate samples were pooled for fragment analysis, which was carried out on an Applied Biosystems (ABI) 3730 XL genetic analyser. The reaction mixture consisted of PCR product (0.5 µl), Liz-500 size standard (orange dye) (0.5 µl), and Hi-Di formamide (9 µl). The mix was denatured at 95 °C for 5 min and snap-frozen for 2-3 min before loading onto the sample tray. The fluorescence intensity was recorded as a function of wavelength and time, and the data were visualized using GeneMapper software (Thermo Fisher Scientific).

Statistical analysis
A two-sample F-test was performed in MATLAB (version R2018a) to assess the significance of the left or right shift in the peaks of the ten repeat-rich markers. The null hypothesis was retained if the test returned h = 0; for h = 1, the null hypothesis was rejected in favour of the alternative.
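For readers without MATLAB, the decision rule above can be sketched in plain Python. This is an illustrative reimplementation under stated assumptions, not the script used in the study: a two-sided test at a 5% significance level is assumed (the level is not stated here), and the names `f_test_two_sample`, `reg_inc_beta`, and `_betacf` are hypothetical. The F statistic is the ratio of the sample variances, and the p-value is obtained from the F distribution through the regularized incomplete beta function.

```python
import math
from statistics import variance

def _betacf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def f_test_two_sample(x, y, alpha=0.05):
    """Two-sided F-test for equal variances; returns (F, p, h) with h = 1 on rejection."""
    df1, df2 = len(x) - 1, len(y) - 1
    f_stat = variance(x) / variance(y)
    # CDF of the F distribution with (df1, df2) degrees of freedom at f_stat.
    cdf = reg_inc_beta(df1 / 2.0, df2 / 2.0, df1 * f_stat / (df1 * f_stat + df2))
    p_value = min(1.0, 2.0 * min(cdf, 1.0 - cdf))
    return f_stat, p_value, 1 if p_value < alpha else 0
```

Calling `f_test_two_sample(control_sizes, cell_line_sizes)` on two lists of fragment sizes (hypothetical variable names) returns the statistic, the p-value, and an h flag that follows the h = 0 / h = 1 convention described above. Note that an F-test compares the spread of the two samples; both samples need at least two values and a non-zero variance.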

Analysis of splice sites for individual markers
The constitutive or cryptic splice sites of the ten repeat-rich markers were analysed using the ASSP (Alternative Splice Site Prediction) web tool (http://wangcomputing.com/assp/index.html). It helped to predict possible shifts in alternative splicing due to hypermutability; alternative splicing is otherwise critical for the regulation of gene expression. The tool reports the frequency of GC nucleotides and the presence of constitutive or cryptic splice sites along with a predictive score.

Profile of tandem repeat instability in cancer cell lines
The PCR amplified products were run on agarose gel to confirm the amplification of repeat-rich regions prior to [...] Fig. 1). The instability profiles of six cell lines for 10 repeat-rich markers were compared with five healthy controls. Instability was detected as a left or right shift in the electropherogram generated by GeneMapper software, indicating contraction or expansion of nucleotide repeats respectively. The markers at different loci in the PCR amplified product exhibited fine peak resolution, which helped to calculate the range of allele size. Additionally, no stutter peaks hindered the distinction between homozygotes and heterozygotes. The product size of each marker is non-comparable with the respective allelic size in homozygous and heterozygous conditions (Supplementary Figs. 2-11). However, for selective repair genes, low to high alterations in perfect, imperfect, and complex repeat sequences were evident in the six cell lines. This established possible changes in tandem repeats of mismatch repair and MGMT genes. The graphical representation of loss and/or gain of nucleotides is depicted in Fig. 1.
Table 1 Details of primer pairs along with position, annealing temperature, number of repeats, product and allele sizes for ten repair genes
[...] 2D).
• MSH6: All the controls and cell lines were homozygous for the MSH6 marker, which was located at the intron-exon junction. In controls, the allele size ranged between 216 and 217 bp. In the CML, AML, and CRC cell lines, maximum contractions of ~56 bp, ~66 bp, and ~55 bp respectively were observed in the complex repeat tract accompanied by poly-A and T nucleotides. The cell lines of T-ALL, Burkitt lymphoma, and breast cancer were microsatellite stable. The alterations in repeat sequences were statistically significant (p < 0.001) (Fig. 2E).
[...] Here again, the range of heterozygous alleles was considerably different for one control. Contraction of (AT) repeats was observed in homozygous Molt-4 of T-ALL for ~6 bp, and Daudi of Burkitt lymphoma for ~13 bp. The KG-1 of AML showed expansion of repeats for ~8 bp. Both loss and gain of repeats were observed in heterozygous HCT116 of CRC, for ~9 bp and ~10 bp in alleles 1 and 2 respectively, which differs from the control, which shows a shifted range rather than gain and loss. The K562 of CML and MCF-7 of breast cancer were microsatellite stable. The changes observed were not statistically significant (Fig. 3C).
• PMS2: The exonic marker of PMS2 was accompanied by imperfect repetitive sequences of poly-A nucleotides. All the cell lines and controls were homozygous for this marker. The control samples showed a fragment at 218 bp, while a shift of a single bp was observed in only three cell lines (K562, KG-1, and MCF-7), which may have no statistical significance for the exonic position of the marker. The Daudi, Molt-4, and HCT116 cell lines did not show instability (Fig. 3D).
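The left-shift and right-shift reading of the electropherogram used in the marker descriptions above can be expressed as a small rule. The sketch below is an assumption-laden illustration rather than GeneMapper's own allele-calling logic: the function name `classify_shift` is hypothetical, and the control range is taken simply as the smallest and largest allele size seen in the controls.

```python
def classify_shift(observed_bp, control_min_bp, control_max_bp):
    """Classify an allele relative to the control range.

    A fragment shorter than the control range is a left shift (contraction);
    a longer one is a right shift (expansion). Returns (call, size_change_bp).
    """
    if observed_bp < control_min_bp:
        return ("contraction", control_min_bp - observed_bp)
    if observed_bp > control_max_bp:
        return ("expansion", observed_bp - control_max_bp)
    return ("stable", 0)

# PMS2 example from the text: controls at 218 bp, a 1 bp deletion gives 217 bp.
print(classify_shift(217, 218, 218))
# MSH6 controls ranged between 216 and 217 bp, so an allele inside that range is stable.
print(classify_shift(216, 216, 217))
```

With only five controls the range is narrow, so a rule of this kind is sensitive to how well the control population is sampled.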

Repeat variations in MGMT gene of direct reversal of DNA damage
• MGMT: For the intronic dinucleotide MGMT marker, three homozygous controls showed fragments at 214 bp and 216 bp, while the allele sizes of two heterozygous controls were observed at 206-214/216 bp for alleles 1 and 2. The cell lines were homozygous for this marker except Daudi, which was heterozygous with a gain of approximately 6 bp in dinucleotide repeats. K562 and Molt-4 did not show CA-repeat variations. Contraction of the repeat sequence was observed in MCF-7, KG-1, and HCT116 by approximately 1, 2, and 8 bp respectively (Fig. 3E).
The major shortcoming here was the inability to define the range of heterozygous alleles in the control population owing to the small sample size. This drawback will be remedied in our future study using larger numbers of controls and patients, scanning whole gene regions of these 10 genes with high-throughput techniques. However, to the credit of this study, the haematologic cell lines KG-1 and Daudi were distinctly unstable for more than 5 of the 10 markers analysed. Polymorphic repeats were observed in HCT116 and MCF-7, consistent with previous findings [19,20]. The total percentage of markers showing hypermutability in the four blood cancer cell lines was 100% in KG-1, 70% in Daudi, 50% in K562, and 30% in Molt-4 (Supplementary Table 4). The highest tandem repeat fluctuations were recorded for the MSH3 marker, as observed in all four blood cancer cell lines. This was followed by MSH2 and

Prediction of splice sites in individual marker
Most of the repeat-rich markers were found to carry cryptic donors and acceptors. Activation of these cryptic splice sites through a change in repeat number may have unfavourable consequences. Likewise, classical splice sites may be disrupted by repeat variations in constitutive regions. Here, the 'intronic GC' represents the frequency with which GC nucleotides occur around the splice site and is closely associated with site recognition and usage [21]. The confidence score for a constitutive or cryptic splice site always lies in the range 0.000 < score < 1.000, while for an unclassified splice site the score is 0.000 (Supplementary Table 5). The BAT26 marker of MSH2 was predicted to carry a cryptic donor with a high confidence of 0.885 and an unclassified splice acceptor. For the MSH3 marker, one cryptic donor with a score of 0.852 was identified. In the MSH4 marker, four cryptic donors (score: 0.258-0.867) and one acceptor (score: 0.727) were recognized. In MSH5, one cryptic splice donor with a score of 0.974 and an acceptor with a score of 0.267 were found. In the MSH6 marker, one cryptic donor (score: 0.944) was predicted. The MLH1 marker was enriched with two cryptic donors (score: 0.789-0.900) and four cryptic acceptors (score: 0.523-0.955). In the MLH3 marker, one constitutive donor with a score of 0.940 and one cryptic acceptor with a score of 0.185 were identified. For the PMS1 marker, only one constitutive donor with a score of 0.274 was predicted. Since the PMS2 marker is exonic, ASSP did not generate any predictive result. In the MGMT marker, one constitutive donor (score: 0.451) and two each of cryptic donors (score: 0.764-0.926) and acceptors (score: 0.437-0.521) were recognized. In terms of ASSP prediction confidence, the cryptic donors in the MSH2 and MSH6 markers were identified with 88.5% and 94% confidence, which supports the presence of splice sites at these loci and correlates with repeat displacement. The cryptic donors in MSH3 and MSH5 and the constitutive donor in MLH3 were identified with 83%, 97%, and 94% confidence respectively. The splice sites harbouring donors and acceptors for MLH1 were identified with high confidence, and those of MSH4 and MGMT with moderate confidence. In PMS1, the confidence was the lowest, at 27%.

Discussion
A preliminary investigation was carried out on the premise that microsatellite repeat variations are crucial to haematologic malignancies and that cytogenetic anomalies may sometimes not take precedence, especially in atypical cases of blood cancer. Hotspots in short repetitive regions were decoded from the vastness of gene regions, particularly the dark DNA. Since the cells were maintained at low passage, additional changes in their cellular and molecular characteristics are unlikely. Various forms of genomic instability are documented in cancers of the blood and other tissues. Tumors driven by defective DNA repair exhibit hypervariability in the length of tandem repeats that results in functional disturbances. The diversity in repeat sequences of satellite DNA was higher for the haematologic KG-1 (10 markers) and Daudi (7 markers) cell lines, along with HCT116 (7 markers) and MCF-7 (5 markers). Such outcomes point to susceptibility to repair deficiency in cancer irrespective of its tissue of origin. K562 and Molt-4 manifested instability in 5 and 3 repair genes respectively. The degree of instability was statistically significant for the markers of MSH2 and MSH6. Interestingly, while BAT26 in MSH2 is known for its instability in solid tumors, the novel MSH6 marker showed a similar trend in most cell lines here. Table 2 consolidates the position, repeats, zygosity, observed allele sizes, and loss/gain in repeat sequences of the ten markers in control samples and cell lines.
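The per-cell-line marker counts quoted above translate directly into the percentage of unstable markers reported for each cell line. As a small worked sketch (the dictionary and the function name `instability_percent` are illustrative assumptions; the counts are those stated in the text, out of 10 markers):

```python
# Number of unstable markers per cell line, as stated in the text (10 markers tested).
UNSTABLE_MARKERS = {
    "KG-1": 10,
    "Daudi": 7,
    "HCT116": 7,
    "MCF-7": 5,
    "K562": 5,
    "Molt-4": 3,
}

def instability_percent(n_unstable, n_markers=10):
    """Percentage of tested markers that showed repeat instability."""
    return 100.0 * n_unstable / n_markers

for cell_line, count in UNSTABLE_MARKERS.items():
    print(f"{cell_line}: {instability_percent(count):.0f}% of markers unstable")
```

For the four blood cancer cell lines this gives 100% (KG-1), 70% (Daudi), 50% (K562), and 30% (Molt-4), matching the values reported in the Results.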
A large contraction of approximately 16 to 18 bp was noted in the poly-A tract of BAT26 in AML (~16 bp) and CRC (~18 bp) cells. This marker spans the exon-intron junction of MSH2 and is accompanied by a cryptic donor and an unclassified splice acceptor, as predicted by the ASSP tool. Hence, this loss of nucleotide repeats at the 5th exon-intron junction may change spliceosome binding at the donor splice site and simultaneously affect the conserved reading frame. As part of the Bethesda panel, BAT26 is confirmed to be replication error positive (RER+), and it impairs the functional integrity of MMR in colorectal and breast cancers [22][23][24]. Opinions on the impact of hypervariability in this marker are divided for blood cancers. A few investigations suggest it is insensitive for microsatellite instability in AML [16,25], while others propose MSI as a distinct feature, especially in de novo AML patients with non-mutant FLT3 [26]. Here, the FLT3-normal cell line KG-1 showed a very similar trend, probably downregulating MSH2. We suggest that BAT26 could serve as a potential marker in primary AML cases with unaltered FLT3 status and complex cytogenetics.
There is no literature available on tandem repeat variations in MSH3, which inspired us to design a primer within the deep intronic AT-rich region of this gene. A significant loss of the dinucleotide tract was seen in the AML and Burkitt's lymphoma cell lines (16 and 18 bp respectively), while other cell lines showed minor loss and/or gain, which may have no effect on gene expression given the deep intronic location of the marker. The ASSP tool predicted the presence of a cryptic donor upstream of the AT-rich region which, though far-fetched, may be recognized by the spliceosome machinery, introducing a pseudo-exon into the mature transcript. Reduced expression of MSH3 was reported earlier in AML, while deficiency of this protein made mouse models susceptible to tumour development [27,28]. The role of MSH3 mutations and reduced expression is reiterated in haematologic malignancies such as adult T-cell leukaemia and childhood ALL [29][30][31]. The dimer of MSH3 with MSH2 (MutSβ) is specific for repair of INDELs, and intronic sequence perturbations may have an overall negative outcome on the mature mRNA, which calls for comprehensive research on MSH3 repeat variations in haematologic malignancies. The MSH4 gene has not been studied extensively for tandem repeat instability in solid tumors or blood cancers. Here too, as for MSH3, we designed a deep intronic AT-rich marker, which showed no notable variation in repetitive sequences. The in silico analysis predicted that this approximately 30-nucleotide AT-rich region is accompanied by cryptic donors and acceptors. We observed the cell line Daudi to be moderately unstable for this gene. Though not likely, inefficient splicing might occur due to contraction and expansion of nucleotide repeats [32]. Yang and colleagues (2020) identified somatic mutations of MSH4 to be associated with microsatellite instability in metastatic bladder cancer [33] and proposed inclusion of this gene in the IHC panel to refine MSI testing. However, for the novel marker here, we suggest a comprehensive screening to generate valuable information on defective MMR that could help overcome the prognostic barriers of haematologic cancers and improve patient survival.
The borderline instability in the intronic poly-A sequence for K562, KG-1, Daudi, and MCF-7, as shown by the MSH5 marker, may not impair the mRNA transcript in cancer, though charting its effect on meiotic recombination, crossover, and immunoglobulin diversity may require an entirely new study. However, SNPs have been reported for this gene in blood cancers [34]. The region amplified by the forward and reverse primers flanking this poly-A tract was predicted to carry a cryptic splice site. Although the magnitude of variation here is not very high, minute changes in nucleotide repeats may interfere with splice sites, causing abnormal splicing and unusual transcripts. IgA-deficient and CVID (common variable immune deficiency) patients harbour a mutant MSH5 allele, with a greater than normal risk of developing non-Hodgkin lymphoma or cancers of the immune system [35,36]. Detailed research on MSH5 is warranted to investigate a possible cause-effect relationship between TR hypervariability and cancer in immunodeficient individuals.
Repeat instability and subsequent loss of MSH6 expression have been observed in colorectal and endometrial carcinoma [37]. The designed primer pair covered intronic repeats flanked by sequences of exon 10. ASSP predicted the presence of a cryptic donor in this region. Here, we report a large deletion of complex repeat sequences at intron-exon boundaries in myeloid and colorectal cancer cell lines. From the fragment analysis results, it can be speculated that the deletion did not affect the repetitive nucleotides but the downstream exon. Owing to the location of the marker at the intron-exon junction, the prominent loss of poly-A or poly-T tracts might alter the exonic splice donor or disrupt the reading frame through exon skipping. Since the strength of splice sites depends on the adjacent sequences, a deletion can not only activate a cryptic splice site but also deregulate the pattern of pre-mRNA splicing. Overall, the marker appears to be important in solid tumors and blood cancer, provided it is studied in a larger sample size. The role of the secondary MMR components MSH3, MLH3, and PMS1 in compensating for a malfunctioning MSH6 is often argued. Under such a scenario, a role for MSH6 hypermutability in the evolution of haematologic cancers and the emergence of MSI cannot be ruled out.
The MLH1 intronic marker D3S1611 showed a prominent shift of CA-rich peaks in KG-1 of AML, and low to moderate changes in Daudi of Burkitt's lymphoma and HCT116 of CRC. This marker was earlier evaluated for MSI in cholangiocarcinoma, wherein only one patient was microsatellite instable at the D3S1611 locus. Later, hypervariability in this marker was found to influence MLH1 expression in solid tumors [38][39][40]. Here, we report that tandem repeat instability might deregulate MLH1 functions in myeloid and lymphoid malignancies. Because MLH1 is a central component of the downstream repair processes, its loss collapses MMR activity [41,42]. The marker also carries cryptic donor and acceptor splice sites. Dinucleotide CA repeats are involved in the regulation of splicing and can activate 5′ splice sites. Hence, contraction and expansion of deep intronic repeats might abrogate mRNA splicing. We suggest that the D3S1611 marker be investigated further to evaluate the biological and clinical risk factors of lymphoid malignancies in relation to damage repair and immunosuppression.
There was a marginal loss and gain of (AT) repeats in the KG-1 and MCF-7 cell lines respectively for the MLH3 marker. In humans, the activity of MutLγ, comprising MLH1 and MLH3, is minimal, with functions usually in strand incision and endonuclease activity. Earlier, a single base pair deletion in an exonic mono-A repeat was identified in microsatellite-instable CRC patients [43]. However, the mutator effect of MLH3 was found to be milder than that of other MMR components. Mutations in this gene are mainly associated with Lynch syndrome [44] but rarely correlate with familial colorectal cancer. This observation accords with our result, as the CRC cell line HCT116 did not show any alteration in repeat length. Somatic mutations of MLH3 are often found in microsatellite-instable colon, rectal, and endometrial cancers [8]. A constitutive donor was identified at the immediate end of exon 5. This may imply that variation in (AT)18 could shift the reading frame of the MLH3 gene, affecting splice site selection. Surprisingly, there is no confirmatory independent report on the role of MLH3 TR variation in blood cancers; as MLH3 is an integral part of the mismatch repair system, there is a rationale for analysing it extensively in myeloid and lymphoid malignancies.
The intronic AT-rich marker of the PMS1 gene showed low to moderate contraction and expansion of the repetitive tract in Daudi of Burkitt lymphoma, KG-1 of AML, Molt-4 of T-ALL, and HCT116 of CRC. The region was identified to carry a constitutive donor. Hence, the repeat hypervariability might result in retention of a mutated intron or impede the early phase of transcriptional elongation. Loss of PMS1 expression was observed in acute T-cell leukaemia cases [29]. In ovarian cancer, absence of PMS1 expression occurs only in the absence of both MLH1 and MSH2 [45]. A mutant PMS1, alone or in association with other MMR genes, causes type 3 Lynch syndrome. The TR-specific hypermutability of this gene has not been deciphered in blood cancers, and the present study can serve as a starting point for research on PMS1 in haematologic malignancies.
The exonic PMS2 marker showed a 1 bp deletion in the CML (K562), AML (KG-1), and breast cancer (MCF-7) cell lines. PMS2, in association with MLH1, edits the erroneous nucleotides and synthesizes an error-free DNA chain at the site of mismatch. Exon 15 of PMS2, wherein the current novel marker is located, was identified as a pseudogene [46]. Although often regarded as 'silent relics', many pseudogenes exhibit tissue-specific activation, as they also regulate oncogenes and tumour suppressor genes (TSGs) by acting as 'microRNA decoys' [47]. The minimal deregulation of this region calls for more research to decipher the extent of its involvement in blood cancer progression.
The degree of variability in the intronic CA-rich elements of MGMT was grossly insignificant. Disruption of pre-mRNA splicing remains a possibility through activation of cryptic donors and acceptors adjacent to the dinucleotide repeats. Only 4% of somatic errors are responsible for loss of MGMT expression, while promoter hypermethylation is the most frequent mechanism of MGMT inactivation, especially in gene silencing related to MSI-low CRC [48,49]. Here, we propose that a large number of haematologic cancer samples should be studied for recurrence of tandem repeat length polymorphisms. Despite being low grade, such repeats might lay a novel foundation for outlining a distinct molecular pathway in haematologic malignancies.
The ASSP analysis predicted the percentage frequency of GC nucleotides in nine of the ten repeat-rich markers (excluding PMS2). Under normal conditions, the GCs were found to be involved in splice-site recognition. However, any contraction or expansion in intronic or exonic repeat regions may lower the capacity for splicing at the actual site. Such misrecognition can produce an abnormal transcript that is translated into unwanted proteins [50]. The in silico splice site analysis also indicated the presence of cryptic donors and acceptors. These are, in general, not utilized by the spliceosome, but upon mutation a cryptic splice site may become activated. Thus, large repeat expansions or deletions in intronic regions, as well as any change in the intron-exon junction, regulatory, or exonic regions, might correlate with abnormal mRNA regulation. The confidence with which the splice donors and acceptors were predicted for individual markers here points to possible abnormal transcript formation, ultimately affecting the repair mechanism in the genome. This study is limited by the unavailability of a control range for the designed markers; nonetheless, the fact that three of the four blood cancer cell lines had repeat variations for most markers suggests that genome stability was compromised.
Based on the research evidence, it can be theorised that MMR deficiency may not only contribute to cancer onset in solid tumors but may also alter the landscape of haematologic malignancies by elevating the mutation rate and driving clonal evolution of cancer cells. The magnitude of the intronic, exonic, or regulatory TR fluctuations shown by the repair genes here is consistent with misplaced splice sites and formation of irregular protein isoforms that could change the inter- and intra-functional networks to establish the disease state. While MSI profiling and the use of Bethesda markers are routine in the diagnostics of solid tumors, the same approach needs to be considered for selected MMR-deficient or therapy-resistant blood cancer cases, especially where cytogenetics fails. In 2017, the FDA (US Food and Drug Administration) approved immune checkpoint blockade (ICB) therapy with pembrolizumab for all MMR-deficient or MSI-high solid tumors. Furthermore, three MMR inhibitors, namely methotrexate, FdCyd, also known as 5-fluoro-2′-deoxycytidine (to restore functionality), and Polγ (to create synthetic lethality in cells deficient in MSH2/MLH1), along with some anti-PD-1/PD-L1 (programmed death receptor 1 and its ligand) monoclonal antibodies such as avelumab, durvalumab, atezolizumab, and nivolumab, alone or in combination, are currently being investigated [51,52]. We propose that these medications may not only combat MMR infidelity but also provide a survival advantage to patients with haematologic malignancies. In future, drug-related TR variations in DDR genes, including MGMT, can also be investigated. Some observations on AML and MSI+ phenotypes are relevant to this question. DNA methyltransferase or histone deacetylase inhibitors are in fact part of the therapy in AML. The MSI+ phenotype can occur in up to 50% of therapy-related AML, which is attributed to a loss of immunosurveillance resulting from transient or prolonged immunosuppression, especially in cases of leukaemia. Such mishaps in the genome occur after stem cell transplants or following chemo- or radiotherapy. It would be interesting to understand in future whether repeat variations in DDR genes are linked to treatment resistance.