Deregulation of MMR-Related Pathways and Anticancer Potential of Curcuma Derivatives: A Computational Approach

Plant-derived products have steadily gained momentum in the treatment of cancer over the past decades. Curcuma and its derivatives, in particular, have diverse medicinal properties, including anticancer potential, with safety supported by numerous in vivo and in vitro studies. Defective mismatch repair (MMR) is implicated in solid tumors, but its role in haematologic malignancies has not been studied closely, and the current literature suggests that this role is limited. Nonetheless, multiple pathways intersect with the mismatch repair proteins in haematologic cancers and may have a direct or indirect bearing on disease progression. Here, through computational analysis, we use various curcuma derivatives (from Curcuma longa L. and Curcuma caesia Roxb.) to target proteins whose altered expression in cancer rewires multiple signaling cascades and, in turn, profoundly affects MMR protein function. These biomolecules were screened for their efficacy on selected targets in blood-related cancers, aberrations of which adversely affect the mismatch repair machinery. The study revealed that, of the 536 compounds screened, six may have the potential to regulate the expression of the identified targets and thus revive MMR function, preventing genomic instability. These results suggest that there may be plant-derived biomolecules with anticancer properties against tumors driven by deregulated MMR pathways.


Introduction
Cancer is a multistep process that involves overexpression of oncogenes and silencing of tumor suppressors through mutations or epimutations. Mutations in cancer lead to uncontrolled cellular proliferation and evasion of apoptosis, as directed by 'driver mutations' in solid tumors and haematologic malignancies [1]. Loss of function of tumor suppressor genes (TSGs) and the accompanying increase in oncogene expression rewire the signaling cascades that promote the 'malignant phenotype'. To provide an effective treatment that deals with such perturbations, the 'functional nodes of the malformed network' need to be identified and restrained [1,2].
Genomic DNA is under continuous stress from endogenous and/or exogenous toxic insults, for which multiple DNA repair mechanisms exist, viz., excision repair (base excision repair, BER; nucleotide excision repair, NER), mismatch repair (MMR), and double-strand break repair (homologous recombination, HR, and non-homologous end joining, NHEJ) [3]. These protect the genetic material from damage and prevent the formation of abnormal cells that may become immortalized instead of undergoing senescence. The mismatch repair (MMR) system is one such mechanism; it involves 9 genes (MSH2-6, MLH1 and 3, and PMS1-2) whose products form heterodimeric protein complexes that help in the recognition and repair of mis-incorporations and mis-alignments [3,4]. Approximately 15% of all primary tumors exhibit MMR deficiency, which can directly affect the DNA and lead to malignant transformation [4]. Stoklosa et al. (2008) found that an increased level of reactive oxygen species (ROS) resulted in accumulation of DNA lesions in BCR-ABL positive CML (Chronic Myeloid Leukaemia) cells, which was associated with inhibition of MMR functions and led to increased genomic instability [5]. Mutations in MMR genes, viz. MSH2 and MLH1, and promoter hypermethylation of MLH1 correlate with loss of function of mismatch repair.
Table 1 demonstrates the molecular properties of all the 30 compounds. The octanol/water partition coefficient, indicated by logP, showed that 27 curcuma compounds were in the range of 0.66-3.38; the exceptions were 3 compounds (Compounds 9, 10, and 46 of C. longa L.), for which logP was >5, thus not satisfying the Lipinski rule (Rule of five: -2≤logP≤5). The other compounds, falling in the acceptable range, were lipophilic; lipophilicity is a major descriptor for understanding the absorption, distribution, transport, and impact of biomolecules in physiological systems. The molecular weight of 28 compounds was found to be in the acceptable range, viz. 200≤MW≤500 (Rule of five).
Compounds with low molecular weight tend to be absorbed well and hence are better suited as pharmaceutical products. However, 2 curcuma compounds, viz. α-pinene and Compound 1d (C. longa L.), violated this parameter because of their very low molecular weight (<200 g/mol). Although such low molecular weight compounds, also known as 'fragments', are currently being screened for druggability, we have dismissed them here. The numbers of H-bond donors (≤5) and acceptors (≤10) for all 30 compounds ranged between 0-4 and 0-6 respectively, which satisfied Lipinski's rule of five. The topological polar surface area (TPSA<140 Å²), which defines the 'relative propensity for polar interactions' of target proteins with ligands, was lowest for α-pinene (0 Å²) and highest for Compound 5 of C. longa L. (115.05 Å²), so all the compounds fell within the limit. The minimum and maximum numbers of rotatable bonds were 0 and 9 respectively across the 30 compounds, reflecting a molecular flexibility that allows possible favourable interaction with proteins. Table 2 demonstrates the bioactivity of the 30 compounds as GPCR ligands, ion channel modulators, kinase, protease and enzyme inhibitors, and nuclear receptor ligands. The C. longa L. Compounds 64, α-pinene, acsjm5, acsjm6, and 1d, and the C. caesia Roxb. Compounds 84 and 103, did not show any promising bioactivity and hence were disqualified from the study.
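The rule-of-five screen described above can be sketched as a simple filter. This is a minimal illustration rather than the pipeline used in the study: the `Descriptor` record, the function name, and the example values are hypothetical, and only the thresholds (-2≤logP≤5, 200≤MW≤500, H-bond donors ≤5, acceptors ≤10) are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    """Hypothetical container for precomputed molecular descriptors."""
    name: str
    logp: float
    mol_weight: float  # g/mol
    h_donors: int
    h_acceptors: int


def lipinski_violations(d: Descriptor) -> list[str]:
    """Return the rule-of-five criteria (as stated in the text) that d violates."""
    violations = []
    if not -2 <= d.logp <= 5:
        violations.append("logP")
    if not 200 <= d.mol_weight <= 500:
        violations.append("MW")
    if d.h_donors > 5:
        violations.append("H-bond donors")
    if d.h_acceptors > 10:
        violations.append("H-bond acceptors")
    return violations


# Illustrative values only; they are not taken from Table 1.
small_fragment = Descriptor("example fragment", logp=3.5, mol_weight=136.2,
                            h_donors=0, h_acceptors=0)
print(lipinski_violations(small_fragment))
```

A compound with an empty violation list passes this part of the screen; TPSA and rotatable-bond limits could be added in the same way.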
Table 4 demonstrates the chemical formulae and IUPAC names of these six compounds. Amongst these, Compounds 3, 88, and 91 are natural compounds present in C. caesia Roxb., and Bisacurone is a non-curcuminoid found in the oil of C. longa L. An earlier investigation identified three natural compounds of black turmeric derived from hexane rhizome extract [15,16]. EF31 and UBS109, on the other hand, are two monocarbonyl derivatives of C. longa L.
The molecular properties of these 6 compounds are tabulated in Table 5. Most drugs in their active form are lipophilic, since they are transported through the cell membrane by diffusion and not by specialized transport systems. The logP value for all six compounds ranged between 0.66 and 2.79 (-7.0≤logP≤6.0), which indicated that the compounds are lipophilic in nature. Lipophilic drugs tend to be absorbed more and excreted less, giving such molecules a longer pharmacologic half-life. A water-soluble drug molecule facilitates delivery of its active ingredient in sufficient quantity but tends to be excreted at a higher rate. The Estimated SOLubility (ESOL≤6), as predicted by the logS value, indicated that the compounds are poorly soluble in water. The logS for the six compounds were:
The bioactivity scores of the 6 compounds are listed in Table 6. Nuclear receptors are highly conserved transcription factors containing DNA-binding and ligand-binding domains, while GPCRs bind and respond to distinct extracellular ligands [17]. Binding of a ligand allows the receptor to trigger multiple intracellular signaling cascades via a specific ligand-bound receptor conformation [18]. Compound 3, EF31, and UBS109 satisfied the GPCR ligand criteria. Binding of these ligands to GPCRs might therefore activate the flow of signal by modulating the downstream effectors. Besides this, Compounds 3, 88, 91, and Bisacurone satisfied the 'nuclear receptor ligand' criteria, which means that these compounds can possibly mediate transcriptional regulation of genes involved in numerous biological functions. Natural products are known to inhibit certain kinases and enzymes. In our study, all six compounds had enzyme inhibition properties, and Compound 3, in particular, showed a kinase inhibition property.
Similarly, overexpression of ion channels in the pathophysiology of cancer-like diseases has prompted the search for potent ion channel modulators. In our study, Bisacurone (0.07) and Compound 3 (0.53) modulated ion channels. Protease inhibitors, on the other hand, can prevent tumor progression and carcinogenesis by blocking or altering access to the enzyme's catalytic site [19]. We found that Compound 3 (0.33), EF31 (0.12), UBS109 (0.08), and Bisacurone (0.14) had protease inhibitor activity.
The bioactivity scores identified Compound 3 (Pumiliotoxin from C. caesia Roxb.) as having the maximum biological activity (score>0.0). It was followed by Bisacurone, EF31, UBS109, and Compounds 88 and 91. We hypothesize that the physiological action exerted by these compounds could be due to interactions with GPCRs and nuclear receptors, modulation of ion channels, and inhibition of proteases, kinases and other enzymes.
Table 7 demonstrates the ADME properties of the 6 compounds under study, obtained using the Pre-ADMET and pkCSM tools.
Distribution (D)-Plasma protein binding (PPB) correlates with lipophilicity and depends on the concentration and number of binding sites of the target protein. The degree of PPB affinity is directly proportional to the efficacy of bioactive compounds. The most important proteins involved in drug binding are human serum albumin, alpha1-acid glycoprotein and lipoproteins. A more lipophilic compound binds plasma proteins more strongly. The PPB assessment predicted that Compound 91 bound strongly to plasma proteins (100%), while Compounds 3, 88, and Bisacurone showed moderate affinity (85.74-88.34%) and EF31 and UBS109 showed weak affinity (54.10-58.73%). In general, a weak interaction with plasma proteins indicates unrestricted transport across the cell membrane and leaves the biomolecules free to interact with the target and other proteins, reducing bioavailability. The blood-brain barrier (BBB) protects the brain from exogenous compounds. A drug's ability to cross the blood-brain barrier may, on one hand, increase toxicity when the drug is distributed through the Central Nervous System (CNS) and, on the other hand, improve drug functionality, especially for targeting brain metastasis. The BBB penetration analysis predicted low absorption (<0.1) for EF31, higher absorption (>2.0) for Compounds 91 and 3, and moderate absorption (0.1-2.0) for Compound 88, UBS109, and Bisacurone in the CNS, which qualifies all but EF31 for eliminating cancer cells in the brain.
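The BBB bins quoted above (low <0.1, moderate 0.1-2.0, high >2.0) can be expressed as a small helper. This is only a sketch: the function name is hypothetical, and the input is assumed to be the BBB penetration value reported by the prediction tool.

```python
def classify_bbb(penetration: float) -> str:
    """Bin a predicted BBB penetration value using the cut-offs quoted in the text."""
    if penetration < 0.1:
        return "low"
    if penetration <= 2.0:
        return "moderate"
    return "high"


# Illustrative inputs only; they are not values from Table 7.
for value in (0.05, 1.2, 2.5):
    print(value, classify_bbb(value))
```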
Metabolism (M)-The CYP450 system essentially aids in drug metabolism and the detoxification of substances from our system. Drugs that interact with CYP450 might be metabolized by one or multiple CYP450 enzymes, and these drugs may either inhibit or induce the cytochrome system. Inhibitors often result in unwanted drug interactions and delay the effect of candidate drugs. In our study, Compound 91 inhibited CYP2C9; Compound 88 inhibited CYP2C9 and CYP3A4; and Bisacurone inhibited CYP2C19 and CYP2C9. Selective inhibition of cytochrome P450 enzymes suggests that a compound may not exert higher toxicity or cause unwanted drug interactions. Compound 3, EF31, and UBS109 were non-inhibitors of the cytochrome system, suggesting complete metabolism of these compounds. None of the compounds inhibited all the enzymes of the CYP450 system.
Excretion (E)-The elimination of a drug molecule is termed 'clearance' and is performed predominantly by the kidneys after metabolism in the liver. The total clearance, log(CLtot), which combines hepatic and renal clearance as predicted by the pkCSM server, indicated that the rate of excretion was highest for Bisacurone (1.382), moderate for Compounds 88 (1.074) and 91 (1.064), and lowest for Compound 3 (0.938), EF31 (0.816), and UBS109 (0.682). Low clearance may be indicative of slow metabolism and thus an increased half-life in the system. Renal OCT2 (Organic Cation Transporter 2) is another excretion parameter determined by pkCSM. These protein transporters are important for the renal uptake, disposition and clearance of a drug molecule. An OCT2 substrate can cause adverse interactions with inhibitors. None of the six compounds was found to be a substrate for renal OCT2, thus predicting the possibility of OCT2-dependent renal clearance of these compounds.
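To make the clearance comparison concrete, the reported log(CLtot) values can be ranked and converted back to a linear scale. The sketch below uses the six values quoted above; the unit (log ml/min/kg) is an assumption based on how pkCSM usually reports total clearance and should be checked against the tool's output.

```python
# log(CLtot) values as quoted in the text.
log_cl_tot = {
    "Bisacurone": 1.382,
    "Compound 88": 1.074,
    "Compound 91": 1.064,
    "Compound 3": 0.938,
    "EF31": 0.816,
    "UBS109": 0.682,
}

# Sort from fastest to slowest predicted clearance.
ranked = sorted(log_cl_tot.items(), key=lambda item: item[1], reverse=True)

for name, log_cl in ranked:
    # Assumed unit: ml/min/kg after undoing the base-10 logarithm.
    print(f"{name}: log(CLtot) = {log_cl:.3f}, CLtot = {10 ** log_cl:.1f} ml/min/kg")
```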
Table 8 lists the 6 compounds that are candidates for lead-likeness, together with their toxicity. EF31 and UBS109 were predicted to be mutagens by the Ames test, but they were non-toxic in the rat and mouse models. Compound 88 was a non-mutagen, although it was found to be toxic in the rat model but non-toxic in the mouse model. Compounds 3, 91, and Bisacurone were non-mutagenic and non-toxic. Additionally, the analysis of hepatotoxicity of the six compounds by pkCSM predicted that only UBS109 was hepatotoxic. Besides this, the maximum recommended tolerated dose (MRTD), which estimates the threshold dose producing an 'acceptable level of toxicity', was found to be higher for Bisacurone (0.705; >0.477 log mg/kg/day) and lower (≤0.477 log mg/kg/day) for Compounds 3, 88, 91, EF31 and UBS109. Bisacurone may thus have the potential to cause an adverse effect, which cannot be predicted unless tested in animal models. These six compounds were also found to satisfy the lead-like criteria [250≤MW≤350; logP≤3.5; rotatable bonds≤7].

Prediction of Metabolic sites of Ligands
The metabolic sites of the compounds are shown in Fig. 2. Among the 6 compounds, Compound 91 and Compound 3 showed high-scoring metabolic sites: carbon atom 3 (score: 0.577) and carbon atom 1 (score: 0.577) for Compound 91, and oxygen atom 20 (score: 0.625) and carbon atom 19 (score: 0.568) for Compound 3. Compound 88, EF31, and UBS109 showed moderate-scoring metabolic sites at carbon atom 8 (score: 0.525); carbon atoms 13 (score: 0.535) and 11 (score: 0.535); and carbon atoms 22 (score: 0.519) and 3 (score: 0.519) respectively. Bisacurone had its metabolic site at oxygen atom 15 (score: 0.477). These sites suggest that, on administration, the compounds could have the potential to initiate and carry out catalytic reactions that affect various cellular functions.

Protein Targets for Curcuminoids
The COSMIC database was used for selection and retrieval of proteins associated with haematologic malignancies. To perform molecular docking we selected 8 proteins, viz. abl1 (v-abl Abelson murine leukaemia viral oncogene homolog 1), myc, max (myc-associated factor X), myb, pcna (Proliferating cell nuclear antigen), top3a (Topoisomerase 3α), p73, and blm (Bloom syndrome), molecular aberrations of which are known to be associated with malignant transformation according to mutational profiling (supplementary Links and Table S1-S8).
The PDB IDs of the crystal structures of these proteins were: 2F0O (abl1); 6GCK (myc and max); 1U7B (pcna); 4CGY (top3a); 1DXS, 2WQI, and 2XWC (p73); 4O3M and 5LUP (blm). The 3D structure of the myb protein was modelled using Phyre2.0 and was validated using the PROCHECK Ramachandran Plot analysis server. Approximately 55.2% of residues were found to be in the favoured region for myb (supplementary Fig. 1(A-B), supplementary Table S9). The active sites of all the proteins were predicted using MetaPocket 2.0, the results of which are tabulated in supplementary Table S10-S17. The output revealed that polar residues occurred at high frequency in the active site architecture and participated in ligand binding by formation of hydrogen bonds. Fig. 3 shows the contribution of different classes of amino acid residues at the binding sites of ligand molecules.
For the abl1 protein, the binding site was enriched with valine (non-polar, aliphatic) and tyrosine (aromatic), and the active residues were distributed in binding pocket 1. In the myc protein, the active site residues were observed in binding pocket 1, and the site was enriched with phenylalanine (aromatic) and lysine (positively charged, basic, polar, hydrophilic). In the case of max, the most abundant active site residues were lysine and arginine (positively charged, basic, polar, hydrophilic), glutamate (negatively charged, acidic, polar, hydrophilic), and glycine (non-polar, aliphatic), all of which were present in binding pocket 5. The residues leucine and proline (non-polar, aliphatic) and serine and threonine (polar, non-charged) were abundant in the active site of the myb protein and resided in binding pockets 1, 2 and 5. For pcna, glutamate (negatively charged, acidic, polar, hydrophilic) was abundant in binding pocket 1. The residues asparagine and threonine (polar, non-charged) and leucine and proline (non-polar, aliphatic) were abundant in binding pockets 1 and 2 of top3a. In the case of p73 (1DXS), the prominent residues were glutamate (negatively charged, acidic, polar, hydrophilic), leucine (non-polar, aliphatic), and serine (polar, non-charged), located in pockets 1, 2, 3, and 5. For p73 (2XWC), binding pocket 1 was rich in proline (non-polar, aliphatic). The residues leucine (non-polar, aliphatic) and arginine (positively charged, basic, polar, hydrophilic) were present in binding pocket 1 of blm (5LUP) (Table S18).

Virtual screening of ligands with proteins involved in Haematologic Cancers
The preliminary analysis of the 6 curcuma compounds (Compounds 91, 3, 88, and EF31, UBS109, Bisacurone) with the desired molecular, biological and drug-likeness properties against the target proteins (abl1, myc, max, myb, pcna, top3a, p73, and blm) was performed using AutoDock Vina to check binding affinity. Table 9 lists the binding affinity scores, which ranged between -4.4 kcal/mol and -8.8 kcal/mol for the 6 ligands and 8 proteins under study. The maximum binding efficacy was shown by black turmeric Compound 88 (supplementary Fig. 3). It was followed by Compound 91, UBS109, EF31, Compound 3, and Bisacurone, in descending order of binding affinity.
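A screening run of this kind amounts to one AutoDock Vina invocation per ligand-protein pair. The sketch below only assembles the command line and does not run it; the file names, search-box centre, and box size are hypothetical placeholders, since the grid settings used in the study are not stated here. The flags shown are standard options of the `vina` command-line program.

```python
import shlex


def vina_command(receptor, ligand, center, size, out, exhaustiveness=8):
    """Build an AutoDock Vina command line for one receptor-ligand pair."""
    cx, cy, cz = center
    sx, sy, sz = size
    return [
        "vina",
        "--receptor", receptor,
        "--ligand", ligand,
        "--center_x", str(cx), "--center_y", str(cy), "--center_z", str(cz),
        "--size_x", str(sx), "--size_y", str(sy), "--size_z", str(sz),
        "--out", out,
        "--exhaustiveness", str(exhaustiveness),
    ]


# Hypothetical inputs: PDBQT file names and box geometry are placeholders.
cmd = vina_command("abl1.pdbqt", "compound91.pdbqt",
                   center=(0.0, 0.0, 0.0), size=(20, 20, 20),
                   out="abl1_compound91_out.pdbqt")
print(shlex.join(cmd))
```

Looping this builder over the 6 ligands and 8 receptors and passing each list to `subprocess.run` would reproduce the 48-pair screen in outline.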

Analysis of ligand similarity
Compound 3, Pumiliotoxin of C. caesia Roxb., was aligned against ATRA (All Trans Retinoic Acid) using the LS-align tool because of their conformational similarity, as identified in our previous investigation (unpublished). Compound 3 was submitted as the query ligand and ATRA as the template ligand. The PC-score-based rigid and flexible LS-align algorithm identified 12 out of 14 aligned pairs with identical atom type at a distance <1 Å. Thus, Compound 3 and ATRA were found to share approximately 85% atomic identity (supplementary Fig. 4-5). This was followed by molecular docking of ATRA with the proteins of interest (myc and p73), as was done for Compound 3. The sites of interaction of ATRA with the two target proteins were in concordance with those of Pumiliotoxin with myc and p73 respectively (supplementary Fig. 6).

Molecular Docking of Compounds and Proteins via AutoDock Tools
The results obtained from AutoDock Vina were finally confirmed via molecular docking of the 6 curcuma compounds against the 8 proteins using AutoDock Tools. The results showed that the curcuma compounds were agonistic to the target proteins. When the minimum hydrogen bond distance between the active pocket residues of the proteins and the ligands was considered, the best docking results were observed for the following protein-ligand complexes: Compound 91 with abl1 (-6.14 kcal/mol) and max (-5.29 kcal/mol); Compound 3 with myc (-4.48 kcal/mol) and p73 (1DXS) (-4.90 kcal/mol); Compound 88 with myb (-6.49 kcal/mol) and blm (5LUP) (-4.87 kcal/mol); UBS109 with myb (-7.41 kcal/mol); EF31 with pcna (-6.76 kcal/mol); and Bisacurone with top3a (-4.52 kcal/mol). These complexes were further evaluated and discussed to understand their effect on modulating the MMR cascade. The calculated best binding energies, inhibition constants, and hydrogen (H) bond forming residues in the protein active sites, along with the bond distances, are summarized in Table 10.
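The relationship between a docking binding energy and its estimated inhibition constant can be illustrated with the standard thermodynamic relation Ki = exp(ΔG / RT). The sketch below assumes T = 298.15 K and R = 1.98719 cal/(mol·K), which is how AutoDock-style estimates are commonly derived; the values actually tabulated in Table 10 may differ and are not reproduced here.

```python
import math

R_KCAL_PER_MOL_K = 1.98719e-3  # gas constant in kcal/(mol*K)


def inhibition_constant(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Estimate Ki (in mol/L) from a binding free energy in kcal/mol."""
    return math.exp(delta_g_kcal / (R_KCAL_PER_MOL_K * temperature_k))


# Binding energy quoted in the text for the Compound 91-abl1 complex.
ki_abl1 = inhibition_constant(-6.14)
print(f"Estimated Ki: {ki_abl1 * 1e6:.1f} micromolar")
```

Because the relation is exponential, a difference of about 1.4 kcal/mol in binding energy corresponds to roughly a tenfold change in the estimated Ki at room temperature.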
The ligands were mainly found to interact with the target proteins by means of hydrogen (H) bonds. Some of the protein-ligand complexes, namely Bisacurone with abl1; UBS109 with blm (5LUP), top3a, and p73 (2XWC); Compound 3 with myb; and Compound 88 with p73 (2WQI), did not form any hydrogen bond with the active residues but interacted through other interatomic interactions, viz., van der Waals interactions, C-H bonds, side chain donor, backbone donor and acceptor, pi-pi stacking, pi-alkyl/alkyl, pi-sulphur, pi-sigma, pi-amide, and pi-cation/anion interactions. This is consistent with Zhao and Huang (2011), who observed that H-bonds alone might not necessarily be important for protein-ligand interactions [20]. Fig. 4 represents the best docked complexes for each ligand with the target proteins with respect to minimum H-bond distance.
Overall, the binding energy scores of the six ligands with the target proteins ranged between -3.33 kcal/mol and -7.25 kcal/mol (Table 11). In descending order of binding energy, UBS109 was found to be the best fit for most of the target proteins (supplementary Fig. 7). It was followed by EF31, Compounds 88, 91, 3 and Bisacurone respectively. The 8 proteins docked with curcuma compounds were further studied for their interaction with the mismatch repair system. STRING was used to carry out the interaction analysis.

Network Analysis to Understand Proteins involved in Haematologic cancers and their interactions with MMR
The STRING analysis identified the interaction of abl1, myc, max, myb, top3a, pcna, p73, and blm (score>0.4) with mismatch repair proteins, as interpreted by the 'experimental' and 'coexpression' channels that analysed the data against KEGG. However, the 'text mining', 'Databases' and 'Co-occurrence' channels also helped in identifying additional inter-connections among the genes. Fig. 5 shows the interactive network of the eight proteins with any of the 9 proteins of the MMR system.
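A query of this kind can also be issued programmatically. The sketch below builds a request URL for the public STRING REST API with a minimum combined score of 0.4 (expressed as `required_score=400`); it is an illustration under stated assumptions, not the procedure used in the study. The gene symbols (including TP73 for p73) and the human taxon ID 9606 are assumptions, and the URL is only constructed, not fetched.

```python
from urllib.parse import urlencode

# Assumed HGNC symbols for the eight targets and the nine MMR genes.
TARGETS = ["ABL1", "MYC", "MAX", "MYB", "PCNA", "TOP3A", "TP73", "BLM"]
MMR_GENES = ["MSH2", "MSH3", "MSH4", "MSH5", "MSH6", "MLH1", "MLH3", "PMS1", "PMS2"]


def string_network_url(genes, min_score=0.4, species=9606):
    """Build a STRING API network request; scores are scaled to 0-1000."""
    params = {
        "identifiers": "\r".join(genes),  # STRING expects carriage-return separators
        "species": species,
        "required_score": int(round(min_score * 1000)),
    }
    return "https://string-db.org/api/tsv/network?" + urlencode(params)


url = string_network_url(TARGETS + MMR_GENES)
print(url)
```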

Discussion
Cancer research has entered an era of targeted therapeutics involving monoclonal antibodies, kinase inhibitors and immune checkpoint blockade. Despite targeting cancer-associated biological pathways, these treatments are limited by toxicities. In blood cancer, current research is focused on CAR-T/NK treatments [21]. However, for over three decades now, plant products have secured a place in cancer treatment, and these natural products have certainly come a long way as anticancer drugs. Curcumin was the first compound to be administered to human subjects, in the year 1987, to observe its efficacy against cancer. Since then, it has been evaluated for a wide range of biological activities from a clinical perspective. Besides being cost-effective and capable of targeting multiple pathways, the curcuminoids limit treatment-acquired resistance, show minimal side effects and might be used alone or in combination with existing therapies [22]. In the present study, we use computational tools to assess the efficacy of curcuma-derived natural and synthetic compounds (Curcuma longa L. and Curcuma caesia Roxb.) against target proteins that cause deleterious consequences in haematologic malignancies. Additionally, we infer that these proteins, following interaction with curcuma biomolecules, may revive their own function and further rescue the expression of deregulated, non-mutated MMR proteins in cancer. After initial screening of 536 curcuma compounds and further selection of 30 compounds based on their molecular, biological, and drug-like properties, we finalized 6 biomolecules, viz., Compounds 3, 88, 91 from C. caesia Roxb., and EF31, UBS109, Bisacurone from C. longa L. The final docking with AutoDock Tools helped us to recognize the H-bond dependent affinity of the 6 curcuma compounds for the eight targets.
The overall docking analysis thus revealed the significance of the amino acid residues at active sites in creating a 'local environment' that aided recognition and binding of the ligands by the target proteins. The figure below (Fig. 6) shows the possible implication of curcuma derivatives in rescuing the MMR machinery during cancer treatment.
It is evident that the distinct types of leukaemia correlate with various forms of the BCR-ABL oncogene, which in turn activates the MAPK, PI3K/Akt, NFκB, and STAT5 signaling pathways responsible for survival and proliferation of leukemic stem cells (LSCs) [23]. Piekarska et al. (2018) reported overexpression of abl1 in Philadelphia-like ALL cases [24]. Despite the availability of known tyrosine kinase inhibitors (TKIs), viz., imatinib, dasatinib, nilotinib, bosutinib, and ponatinib, the development of resistance due to acquisition of bcr-abl kinase domain mutations, accompanied by toxic side effects, costs, and safety issues, has subdued the enthusiasm for using these drugs as the 'better choice' in both CML (accelerated phase and blast crisis) and Ph+ ALL (Philadelphia positive Acute Lymphoblastic Leukaemia) cases [25][26][27]. The TKIs are also often not effective in genetically complex leukaemic cases because of the development of resistance. There is also the possibility of secondary malignancies developing after treatment with imatinib [28][29][30]. Compound 91 was identified as the best fit for abl1 (H-bond: Val92, distance: 3.49 Å), as judged by hydrogen-bond distance. If a modified Compound 91 proves safe and efficacious in future, it could be a good candidate for targeting abl1 differently from the currently available drugs. Here, through STRING, abl1 shows direct interaction with the MMR protein msh5 [31]. Over-expression of msh5 has been reported in BCR-ABL positive CML [32]. Structural variations of the MSH5 gene have been reported in T-cell ALL in a study conducted by Zhang and research group (2012) [33]. Disruption of functional msh5 protein leads to altered mismatch repair and affects double-strand break repair (DSBR) pathways (Fig. 7). BCR-ABL positive CML (Chronic Myeloid Leukaemia) cells also lower the expression of mlh1 and pms2 and induce point mutations, thus affecting the mismatch repair mechanisms [34,35].
Here we hypothesize that Compound 91 might have a significant impact in countering the mutational events and overexpression of abl1, along with its fatal consequences for mismatch repair proteins involved in DSBR mechanisms. In vitro and in vivo studies have shown that the binding of max to msh2, and of myc to mlh1, initiates the DNA repair process by formation of heterodimeric protein complexes [36]. A mutant max might lead to partial or complete loss of msh2 function, thus affecting MMR and causing loss of cellular apoptosis and genomic instability. It also affects the function of myc in normal and neoplastic conditions, since myc forms a sequence-specific DNA-binding complex with max [37]. Compound 91 of C. caesia Roxb. was identified as a decent agonist for max (H-bond: Lys153, distance: 2.79 Å). Therefore, this compound can possibly help to repress an overexpressed max by inhibiting myc-max dimerization, thus reducing the DNA binding potential and transcriptional activity of these proliferators in blood-related cancers. Simultaneously, low expression of max will revive the function of msh2 and rewire the MMR. To summarize this in silico data, Compound 91 can potentially target the abl1, max, and myc pathways and revive DNA repair mechanisms.
The constitutive dysregulation of the myc protein is associated with its overexpression and with poor prognosis in the majority of human cancers, including blood cancers. Downregulating myc has thus been a prime goal in anticancer therapies [38][39][40], and it can be an ideal therapeutic target in haematologic malignancies as well. In the present study, we found Compound 3 (Pumiliotoxin of C. caesia Roxb.) to have a good association with myc (H-bond: Glu935, minimum distance: 4.91 Å). Interestingly, myc also associates with mlh1 to regulate mismatch repair [36]. Thus, targeting myc with Pumiliotoxin (Compound 3) in blood cancer might help to upregulate mlh1 and direct the execution of damage recognition and repair. An interesting fact that came to light through our earlier studies (unpublished) is that Compound 3 structurally resembles ATRA (All Trans Retinoic Acid) closely, as also inferred by LS-align here (Supplementary Fig. 4 and 5). ATRA has a role in the downregulation of pin1 (Peptidylprolyl Cis/Trans Isomerase, NIMA-Interacting 1) in acute myeloid leukaemia (AML) [41]. Notably, pin1 physically interacts with myc, and both pin1 and myc are overexpressed in multiple cancers [42]. Downregulation of myc and pin1 by ATRA is already known [43,44]; however, owing to its short half-life, ATRA is not a very effective anticancer therapy. Modification of Pumiliotoxin may therefore yield a novel, target-driven future drug. Overexpression of p73, and its loss of expression as a result of hypermethylation, were earlier reported in leukaemias and lymphomas [45,46]. Hence, the protein p73, which is rarely mutated but frequently deregulated in cancer, especially APL, requires therapeutic intervention [47]. Here, we identified Compound 3 (-4.9 kcal/mol; Asp41, 2.95 Å) as best suited for binding with p73 and protecting the function of pms2. Under normal conditions, the mismatch repair protein pms2 stabilizes p73 to stimulate p73-dependent apoptosis [48]. The requirement for pms2 in damage-induced activation of p73 is evidence of a direct signaling function of MMR proteins. Besides this, pms2 is a binding partner of mlh1. ATRA could be a potent modulator of aberrant p73 expression in haematologic cancers [47]. As mentioned above, a structural resemblance has been found between Compound 3 and ATRA. Thus, with proper modification, Compound 3 may work as a good modulator of myc and p73 and aid the revival of MMR in various cancers.
The overexpression, recurrent translocation and duplication of myb have been reported in AML, ALL, acute basophilic/myelomonocytic leukaemia, and adult T-cell leukaemia [49]. The MYB gene is indirectly connected to MMR via MYC and ABL1. A number of studies confirm that myb binds to the promoter regions of myc and directly regulates the expression of the myc protein [50]. The oncogenic myc and bcl-2 are known to be direct targets of myb [51]. This interdependency of myc and myb can be explored for therapeutic targeting. Similarly, bcr-abl1 transformed myeloid and lymphoid cells rely on aberrant expression of myb, causing 'addiction of leukemic cells towards myb' [52]. In our study we theorize that indirect downregulation of abl1 and myc through Compounds 91 and 3 respectively can alter the function of myb. Alternatively, we identified Compound 88 (H-bond: Gln274, distance: 2.83 Å) from C. caesia Roxb. and UBS109 (H-bond: Lys587, distance: 3.07 Å) from C. longa L., which pointed towards favourable binding with myb. These may modulate myb, which in turn may downregulate the E2F1 transcription factor involved in creating a 'second wave of transcription' for progressing through the aberrant cell cycle during cancer. The MMR genes MSH2 and MLH1 are known targets of E2F1 [53]. While myc targets mlh1 and abl1 targets msh5, there is no direct interaction between myb and MMR proteins. These pathways may be explored in future for various anticancer therapies.
Bloom syndrome patients frequently develop haematologic malignancies [54,55]. Yeast two-hybrid assay, co-immunoprecipitation and far western analysis confirmed that the C-terminal region of blm interacts directly with mlh1 to maintain genomic stability [56]. Besides this, blm is also known to be regulated by the msh2-msh6 heterodimeric complex [57]. We found that black turmeric Compound 88 docked best with the protein blm (-4.87 kcal/mol; H-bonds: Glu377, 2.85 Å; Arg407, 3.66 Å; and Cys380, 4.96 Å). Hence, this natural compound of black turmeric might have a significant impact on non-mutant, deregulated blm expression, such that its negative impact on MMR can be nullified.
PCNA is a central component of DNA replication and repair that interconnects the MMR proteins msh3 and msh6. The pcna-msh3-msh6 complex, upon stacking on DNA, activates human MutS and MutL (MSH2-6 and MLH1,3; PMS1,2) components [58]. Elevated expression of pcna has been observed in multiple cancers, including CML and CLL, and correlates with poor survival [59]. A combined treatment of curcumin and doxorubicin was found to reduce expression of pcna in liver cancer [60]. Similar research found that curcumin, alone or in combination with gemcitabine, can suppress abnormally expressed PCNA in pancreatic cancer cells [61]. In the present study, EF31 formed its best docked complex with pcna via H-bond formation with the active residues Glu124 (3.32 Å) and Glu25 (3.20 Å). This suggests that the mutational effect of pcna can possibly be downregulated by EF31 to restore MMR functionality, with fewer of the side effects exerted by drugs such as doxorubicin or gemcitabine.
Although top3a does not directly interact with MMR proteins, the BTR (BLM-TOP3A-RMI1/2) complex, which includes blm and top3a, is involved in DSBR [62,63], wherein blm is known to interact with mlh1, msh2 and msh6. Overexpression of topoisomerase has been recognized in multiple malignancies [64,65]. Here, Bisacurone showed its best binding affinity towards top3a by forming H-bonds with the active residues Tyr 377 (H-bond distance: 3.11 Å) and Asn 406 (H-bond distance: 5.2 Å). Thus, we postulate that targeting mutant top3a with Bisacurone, while also targeting blm with Compound 88 and UBS109, may cumulatively help in regulating the abnormal expression of top3a, thereby impacting downstream effector proteins.
The figure below (Fig. 7) is a diagrammatic representation of the cancer-related pathways that can be targeted with the curcuma compounds mentioned in this study, in order to protect and rewire the DNA [...] found that curcumin not only increased the potency of 5-FU in a dose-dependent manner but also reduced the proliferation of MMR-deficient tumor cells [66]. Chen et al. (2003) identified an anti-leukemic mechanism of curcumin that elicited increased expression of the mismatch repair genes MLH1 and MSH2, followed by cellular apoptosis [67]. Jiang and colleagues (2010) reported that MMR-deficient CRC cells show higher sensitivity towards curcumin, which can be attributed to deregulation of multiple signaling cascades. Although curcumin-induced oxidative damage was independent of MMR status, the activation of Chk1/2 and G2/M cell cycle arrest by curcumin requires intact MMR function [68]. From this computational study, we suggest that Compounds 3 and 91 of C. caesia Roxb. had the best drug-like properties considering their interaction with myc, max and abl1, respectively; these are the major contributors to the emergence of haematologic malignancies. Additionally, Compound 88 and UBS109 bound well with the myb protein. Recent investigations reported the efficacy of UBS109, EF31 and Bisacurone against pancreatic cancer growth and breast cancer metastasis; these compounds act mainly upon NFκB and inhibit this cascade by suppressing IKKα and β [69-71]. However, the structure of UBS109 needs modification to reduce the mutagenicity and hepatotoxicity predicted by the in silico tools in the present study.
In a nutshell, the bioinformatics analyses revealed promising efficacy of the curcuma compounds against selected oncogenes and tumor suppressor genes, aberrations of which may lead to deregulation of the MMR system along with perturbation of the functional inter- and intra-molecular network. This study highlighted the significant protein-ligand interplay through various interatomic interactions and demonstrated the possible molecular mechanisms underlying the docking of these compounds with target proteins in haematologic malignancies. To our knowledge, there are to date no reports that computationally explore the anticancer potential of curcumin-based ligands in haematologic malignancies with a focus on the DNA mismatch repair machinery.

Methodology
The diagrammatic representation in Fig. 8 depicts the flow of work. The chemical compounds were retrieved and docked with the selected targets. This was followed by deriving the interaction map of the target and MMR proteins, which illustrates the impact of curcumin compounds on the function of MMR.

Calculation of Molecular Properties and Bioactivity of ligands
The molecular properties and bioactivity of the ligands were calculated using Molinspiration (https://www.molinspiration.com/). The fifteen descriptors analysed by Molinspiration were molecular weight, logP, topological polar surface area (TPSA), volume, number of atoms, rotatable bonds, hydrogen bond donors and acceptors, number of violations of Lipinski's rule, and bioactivity. The bioactivity properties comprise GPCR (G-protein coupled receptor) ligand, nuclear receptor ligand, ion channel modulator, and kinase, protease and enzyme inhibitor scores [76].
All the biomolecules from various sources of Curcuma that satisfied the above criteria were further evaluated by energy minimization, virtual screening and docking.
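As an illustration of the drug-likeness filter described above, the following minimal sketch counts violations of Lipinski's rule of five from descriptor values such as those reported by Molinspiration. The thresholds are the standard rule-of-five limits. The `Ligand` class, the function names and the cut-off of at most one violation are assumptions made for this example and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Ligand:
    """Descriptor values for one compound (hypothetical container)."""
    name: str
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # number of hydrogen bond donors
    h_acceptors: int    # number of hydrogen bond acceptors


def lipinski_violations(lig: Ligand) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    checks = [
        lig.mol_weight > 500,
        lig.logp > 5,
        lig.h_donors > 5,
        lig.h_acceptors > 10,
    ]
    return sum(checks)


def passes_filter(lig: Ligand, max_violations: int = 1) -> bool:
    """Assumed screening criterion: tolerate at most `max_violations`."""
    return lipinski_violations(lig) <= max_violations
```

Compounds for which `passes_filter` returns `True` would then proceed to energy minimization, virtual screening and docking.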

Prediction of Sites of Metabolism (SoMs) of compounds
FAME3 (https://nerdd.zbh.uni-hamburg.de/fame3/), the FAst MEtabolizer program, predicts the sites of metabolism (SoMs), i.e. the atoms at which a metabolic enzyme initiates a catalytic reaction [80]. In this study, the prediction of these sites indicated the number of functionally interactive sites in the phytochemicals and may aid the design of drug derivatives in future.

Retrieval and preparation of target proteins
The COSMIC (https://cancer.sanger.ac.uk/cosmic) database was used to select genes associated with haematologic malignancies [81]. A brief mutational profiling of the selected genes was carried out using PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/) and Meta-SNP (https://snps.biofold.org/metasnp/) [82,83]. All the genes under study showed deleterious mutations in their coding regions, suggesting that they contribute towards tumor initiation in the blood tissues they originate in. The search was restricted to blood-related cancers since they are the focus of this study. The RCSB Protein Data Bank (https://www.rcsb.org/) was used to extract the crystal structures of the proteins of interest. For proteins lacking a crystal structure in the databank, models were computed using Phyre2 (Protein Homology/analogY Recognition Engine 2.0; http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index) and validated using the PROCHECK Ramachandran plot analyser (https://servicesn.mbi.ucla.edu/PROCHECK/). Finally, all hetero-atoms, water molecules and additional chains were removed from the protein structures using Discovery Studio Visualizer prior to virtual screening and molecular docking.
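The clean-up step was performed in Discovery Studio Visualizer. Purely as an illustration of what that step does, the sketch below applies the same filtering to PDB-format text in plain Python: it keeps only the ATOM records of one chain and drops HETATM records, which hold ligands, ions and water molecules. The function name and the default chain "A" are assumptions for this example.

```python
def clean_pdb(pdb_text: str, keep_chain: str = "A") -> str:
    """Return PDB text containing only ATOM records of `keep_chain`.

    In the fixed-column PDB format the chain identifier is column 22,
    i.e. index 21 of the line. HETATM records (ligands, ions, waters)
    and all other record types are discarded.
    """
    kept = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and len(line) > 21 and line[21] == keep_chain:
            kept.append(line)
    kept.append("END")
    return "\n".join(kept) + "\n"
```

The function would be called on the contents of a downloaded structure file and its result written to a new file before docking.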

Protein Active Site prediction
Prior to docking, the prominent active binding sites of each protein molecule were predicted using MetaPocket2.0 (https://projects.biotec.tu-dresden.de/metapocket/index.php). The binding pockets, consisting of the active residues of each protein, were identified and later compared with the docking results [84].

Domain analysis of proteins
MOTIF Search (https://www.genome.jp/tools/motif/) and ScanProsite (https://prosite.expasy.org/scanprosite/) were used to find the motifs/domains of the target proteins, which may help in understanding the activity of the proteins.

Energy minimization of ligands and proteins
The energy minimization of the selected ligand molecules and proteins was performed using YASARA (Yet Another Scientific Artificial Reality Application) (http://www.yasara.org/minimizationserver.htm), which utilizes the YASARA knowledge-based potential force field [85].

Virtual screening of ligands
The virtual screening of ligand molecules was performed with AutoDock Vina, which utilizes a 'gradient optimization method' to improve the accuracy of its binding affinity predictions while minimizing computation time [86].

Molecular docking
Molecular docking was performed using AutoDock Tools version 4.2.1 [88]. Polar hydrogens were added to the receptors (proteins), followed by addition of Kollman charges and computation of Gasteiger charges. The torsions were calculated for the respective ligands, and both receptor and ligand files were saved in .pdbqt format. Grid optimization was performed using the AutoGrid programme, and the grid box was centered such that it covered all identified active pocket amino acid residues. Docking was carried out using the AutoDock programme, and ten different conformations were generated and ranked by their binding energies. The energy values in AutoDock are calculated on the basis of several terms: hydrogen bond, desolvation, van der Waals and electrostatic energies, the internal energy of the ligand, and the torsional free energy. Amongst these, the desolvation and van der Waals energies together form the binding energy; the hydrogen bond and van der Waals energies form the docking energy; and the strength of binding of the ligand to the receptor is determined by electrostatic interactions. Complexes having the lowest binding energy were considered the best receptor-ligand structures and were chosen for post-docking analysis. The results were visualized using Discovery Studio Visualizer and MOE (Molecular Operating Environment).
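The grid box placement described above can be sketched as follows. Given the coordinates of the atoms in the predicted pocket residues, the sketch computes a box centre and the number of grid points per axis so that the box encloses all of them. The function name, the 5 Å padding and the tuple-based input are assumptions for this example. The default spacing of 0.375 Å is the usual AutoGrid default, and the point counts are rounded up to even numbers, as AutoGrid expects. The grid parameters actually used in the study are not stated.

```python
import math


def grid_box(coords, spacing=0.375, padding=5.0):
    """Compute a grid box enclosing all pocket atoms.

    coords  : list of (x, y, z) tuples in angstroms
    spacing : distance between grid points in angstroms
    padding : extra margin added on each side of the box (assumed value)
    Returns (center, npts), where center is (x, y, z) and npts holds the
    even number of grid points along each axis.
    """
    if not coords:
        raise ValueError("no coordinates given")
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    npts = []
    for lo, hi in zip(mins, maxs):
        n = math.ceil((hi - lo + 2 * padding) / spacing)
        npts.append(n + (n % 2))  # round odd counts up to the next even number
    return center, tuple(npts)
```

The returned centre and point counts correspond to the grid centre and number-of-points settings entered in AutoDock Tools when preparing the grid parameter file.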

Network construction
The STRING database (https://string-db.org/) was utilized to construct the interaction network between the target proteins selected in this study and the proteins of the MMR system [89]. This aided in understanding the impact of the targets on the function of the MMR proteins and may in future help find a way to modulate MMR via the protective effect of curcuma compounds.
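To make the idea of direct versus indirect links to MMR concrete, the sketch below builds a small undirected graph and uses breadth-first search to find the shortest route from a target protein to any MMR protein. The edge list is hand-transcribed from the interactions described in the discussion of this study and is not retrieved from STRING. The function names and the choice of breadth-first search are assumptions for this illustration.

```python
from collections import deque

# Interactions as described in the text of this study (treated as undirected).
EDGES = [
    ("myb", "myc"), ("myb", "abl1"),
    ("myc", "mlh1"), ("abl1", "msh5"),
    ("blm", "mlh1"), ("blm", "msh2"), ("blm", "msh6"),
    ("pcna", "msh3"), ("pcna", "msh6"),
    ("top3a", "blm"),
]
MMR = {"mlh1", "msh2", "msh3", "msh5", "msh6"}


def build_graph(edges):
    """Build an adjacency map from a list of undirected edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_to_mmr(graph, start, mmr=MMR):
    """Breadth-first search from `start` to the nearest MMR protein.

    Returns the path as a list of protein names, or None if no MMR
    protein is reachable.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in mmr and node != start:
            return path
        for nxt in sorted(graph.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

A path of length two (for example from pcna or blm) indicates a direct interaction with an MMR protein, whereas a path of length three (for example from myb or top3a) indicates a connection through one intermediate protein.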