A Comprehensive In Silico Perspective For Discovery of Novel Inhibitory Candidates Targeting Versatile Transcriptional Repressor MBD2

Genome methylation is a key epigenetic mechanism in various biological events such as development, cellular differentiation, cancer progression, aging, and iPSC reprogramming. Crosstalk between DNA methylation and regulation in gene expression is employed through MBD2, known as reader of DNA methylation and suggested as a drug target. Despite its magnitude of signicance and rationale of nomination, a scarcely limited number of druggable ligands has been detected so far. Hence, we screened a comprehensive compound library, and then certain of them were subjected to computational docking analysis by targeting the methylated DNA-binding domain of human MBD2. We could detect reasonable binding energies and docking residues presumably located in druggable pockets. Docking results were also validated via MD simulation and per-residue energy decomposition calculation. Drug-likeness of tested ligands was assessed through ADMET prediction in order to foresee off-target side effects for future studies. Herein, on the basis of collaborating approaches such as molecular docking, MD simulation, energy decomposition, and ADMET prediction, notably two compounds named CID3100583 and 8,8-Ethylenebistheophylline, have become prominent as novel candidates, possibly disrupting MBD2 MBD –DNA interaction. Hereby, these compounds exhibit a promising usage potential in a wide range of implementations from cancer treatment to somatic cell reprogramming protocols.


Introduction
Epigenetic regulation is a key process in which involves various mechanisms, dynamically altering and specifying gene expression and cell fate (Aloia, 2021). As a predominant epigenetic mechanism, DNA methylation involves covalent attachment of methyl groups (-CH 3 ) from S-adenosyl-L-methionine as the methyl donor to the carbon-5 positions (C5) of the cytosine residues within a context of cytosine-guanine (CpG) dinucleotides resulting in 5-methylcytosine (5-mC) analogue (Moore et al., 2013). Principally, three catalytically active DNA Methyltransferases in mammals, DNMT1, DNMT3A, and DNMT3B, maintain an overall DNA methylation pattern speci c to either cell type, developmental period, and differentiation status (Gowher & Jeltsch, 2019;Jurkowska & Jeltsch, 2016). Methylation on CpG dinucleotides near and/or inside promoter sequences reveals a well-known hallmark for DNA-binding proteins, and thus gene transcription is repressed due to avoidance of transcription factors and machinery (Kribelbauer et al., al., 2002). Herein, Methyl-CpG Binding Domain (MBD) family members, also known as "readers of DNA methylation" (H. Zhu et al., 2016), are outstanding transcriptional repressors, which facilitate crosstalk between DNA methylation and histone deacetylation (Baubec et al., 2013). All members of MBDcontaining protein family in mammals share a highly conserved methylated DNA-binding domain, consists of 70-85 amino acids (Stirzaker et al., 2017;H. Zhu et al., 2016).
MBD2-mediated epigenetic regulation has been implicated in important biological functions such as neuronal activity and development, proliferation and maturation of various immune cells, hematopoiesis, pluripotency, and cancer initiation and progression (Ginder & Williams, 2018;Menafra & Stunnenberg, 2014;. For instance, it has been evidenced that MBD2 augments metastatic potential, cell proliferation, and growth by especially downregulating hypermethylated tumor suppressor genes in various tumor types (M. Y. Kim et al., 2019;Li et al., 2020;Martin et al., 2008;Mian et al., 2011;. Hereby, KCC-07, a brain-permeable small molecule inhibitor of the MBD2, was clearly demonstrated to inhibit medulloblastoma (MB) cell growth in vitro and reduce tumor volume in xenograft models through the reactivation of BAI1/p53 pathway (D. Zhu et al., 2018). Additionally, MBD2 depletion together with DNMT inhibitor (5-azaCdR) treatment in breast cancer cells resulted in synergistic antiproliferative and anti-invasiveness effects through recovery of hypermethylation-mediated repression of apoptotic genes (Cheishvili et al., 2014). In other respects, MBD2 was shown to facilitate epigenetic suppression of pluripotency and self-renewal related genes such as Nanog, Oct3/4, and c-myc during the commitment of ESCs, consistent with its higher expression level in terminally differentiated cells. Besides, MBD2 is primarily downregulated by essential miRNA clusters at the early stage of somatic cell reprogramming and its depletion through speci c siRNA increases the e ciency of iPSC production.
Therefore, MBD2 stands out as a potential target in especially treatment of cancer (Ginder & Williams, 2018), and regulation of pluripotency-differentiation balance in iPSC reprogramming (Lu et al., 2014).
Although the structure of MBD2 and its function in several biological processes and disease progression have been well-characterized suggesting as key targets, the number of available inhibitors against that epigenetic regulator is still barely limited and therefore submission of novel potential ones maintains its importance yet. To date, NF449, Mitoxantrone, Idarubicin, Aurintricarboxylic acid have been proposed as preventing MBD2 from methylated DNA binding (Wyhs et al., 2014), and KCC-07 (commercially available) and KCC-08 have been biologically tested in vitro and/or in vivo (Giovinazzo et al., 2018;D. Zhu et al., 2018). Hence, we aimed to evaluate novel inhibitor candidates targeting MBD2 MBD -DNA interaction in order to reverse transcriptional silencing on crucial genes such as tumor suppressors, pluripotency factors, and etc. for multiple purposes. In this regard, we accomplished a comprehensive screening to reveal small molecules as potential inhibitors in this study. Our results have pointed these small molecules prominently docked signi cant residues such as Val177, Arg188, Lys190, Phe206, Phe208, and Arg209 in DNA-binding domain, predicting an interrupting impact on MBD2-methylated DNA recognition and interaction. Furthermore, Molecular Dynamics (MD) simulation validated MBD2 MBD -ligand interaction by meaningful RMSD, RMSF, B-Factor values, and energetics. Besides, it has been noticed that predicted 3-dimensional binding pockets on MBD2 MBD and residues in which ligands occupied, were considerably overlapped.
Taken together, extracted small molecules have emerged as potential inhibitor/drug candidates with binding a nity and inhibition constants in a range of -7.5-(-)11 kcal/mol and 4-1880 nM, respectively, as shown computationally. Hereby, the current study has provided substantial data, particularly for compounds CID3100583 and 8,8-Ethylenebistheophylline which might lead to in vitro and/or in vivo studies testing their applicability in cancer treatment and somatic reprogramming protocols.

Docking Based Virtual Screening To Elicit Potential Binders
A screening library was prepared from ZINC15 database compounds. We regarded certain criteria such as logP value, MW, and charge for selection of molecule tranches. Wyhs et al. had demonstrated Mitoxantrone,Idarubicin,NF449, and Aurintricarboxylic Acid impaired MBD2 MBD binding to methylated DNA in TR-FRET assay however mechanism of action remained unclear (Wyhs et al., 2014). Since these compounds predominantly have charges, logP, and MW ranging from -3 to +2, 0 to 3, 400 to 500 Da, respectively, we exploited related characteristics, emerging as a screening strategy to start out. Besides, logP, MW and charge were retained moderately in order to explore ligands with favorable solubility in aqueous vehicles, convenience for passing through the plasma membrane and intracellular delivery for potential use in prospective in vitro/in vivo studies. 15,893 different compounds yielded numerous docking poses including 120,310 available conformations with reasonable binding a nities. Among these, certain hit compounds with docking score over -5.0 kcal/mol which was pointed according to structure and/or biological activity (if available), were listed in Supplementary Table S1. Although most of the compounds have not been annotated, assayed and/or functionally identi ed, already identi ed/annotated hits were also recovered by the virtue of screening. Most of the hits have been attained to inhold heterocyclic groups, especially carboxamide, quinoxaline, imide, benzene, thiazole, piperazine, naphthalene, pyridazine, isoquinoline, pyrrole, pyrimidine/pyridine, furan, aniline, imidazole (see Supplementary Fig. S1). Moreover, ligands composed of xanthine, guanine, adenine or their derivatives and/or methylated groups, were also quite notable in terms of implying a plausible compatibility between potential binders and methylated DNA-binding domain.

Elaboration of Target Sites of Selected Leading Compounds Throughout MBD2 MBD
The existence of signi cant structures and groups within hit compounds (see Supplementary Fig. S1) led us to determine binding regions more comprehensively by computational docking analysis. Thereby, selected ligands with higher screening scores were docked against re ned receptor corresponding to human MBD2 MBD (PDB: 7AO8, Chain C). Among 10 of unidenti ed and 5 of annotated compounds, CID3100583 exhibited the best docking pose with -11.42 kcal/mol, 275.20 Å, and 4.22 nM as binding energy, RMSD, and estimated inhibition constant (K i ), respectively ( Table 1). It was followed by CID343482, CID136748749 as proximately. Nevertheless, estimated K i of CID3100583 (4.22 nM) was ahead of other candidates as roundly 8-fold (Table 1), suggesting less amount could be su cient for a half-maximal inhibition on MBD2 MBD -DNA binding in actual fact. Correspondingly, each of CID3100583 and CID343482 constituted 9 H-bonds (Table 2) in company with 2 hydrophobic interactions and 3 hydrophobic interactions, respectively (Table 3). Albeit both ligands established the same number of H-bonds, more diverse amino acids contributed interaction with CID3100583 atoms. 8,8'-Ethylenebistheophylline (Ebis-Theophylline) committed the strongest binding energy (-10.32 kcal/mol) with the lowest K i (27.30 nM) and the second rank among annotated drugs and all compounds, respectively ( Table 1). Although Regadenoson and Glucosylhydroxymethylcytosine (GHMCytosine) formed 9 H-bonds (Table 2), their binding a nities and estimated K i were relatively lower and besides less number of divergent amino acids participated in binding sites in a similar way with others (Table 2 and 3).
Since ligand docking may trigger conformational changes on protein, we aimed to better re ect the dynamics nature of protein-ligand interaction and local rearrangements of side-chains of active site amino acids via induced-t docking (IFD) (Antunes et al., 2015;Huang & Zou, 2010) analysis. As a result of IFD ( exible receptor; exible ligand), similar binding-pocket residues were appointed ( Fig. 3c and 3d), consolidating our initial rigid docking results (rigid receptor; exible ligand) again. We detected various transition states of amino acid side-chains, particularly surrounding CID3100583 and Ebis-Theophylline zones between apo (unbound) receptor and holo (ligand-bound) complex ( Fig. 3a and 3b). It was elicited that Arg166, Asp176, Tyr178, Asp207, Phe208 became closer to CID3100583 atoms upon formation of the holo complex while Val177, Arg188 underwent a minor secession ( Fig. 3a; see Supplementary Movie S1 for distances). A similar adjustment was also mapped for complex with Ebis-Theophylline according to unbound state (Fig 3c; see Supplementary Movie S2). Additionally, we re-calculated binding energies by using PRODIGY-Ligand interface (Vangone et al., 2019) in order to eliminate differences outputted by different software and to compare whether a nity could establish any alteration in IFD. Binding energies (kcal/mol) were listed as -5.21, -7.52, -5.28, -6.49 for CID3100583-MBD2 MBD (rigid receptor), CID3100583-MBD2 MBD ( exible receptor), Ebis-Theophylline-MBD2 MBD (rigid receptor) and Ebis-Theophylline-MBD2 MBD ( exible receptor), respectively.
Favorably, amino acids interacted with ligands have been detected highly conserved between MBD1, MBD3, MBD4, and MeCP2 in human (see Supplementary Fig. S3) and MBD2 in different organisms via multiple sequence alignment. Residual counterparts in both MBD1 (Val20, Arg22, Tyr34, Arg44, Ser45) (Ohki et al., 2001) and MeCP2 (Arg106, Arg133, Trp155, and Tyr158) (Ballestar et al., 2000;K. Liu et al., 2018) construct a methylated-DNA binding site as reported, designating a highly conserved DNA binding tendency for MBDs. To test whether ligands exhibited an incidental complementarity with docking site on MBD2 MBD or compounds were detained due to their DNA-like preferability for functional residues, we docked CID3100583 and Ebis-Theophylline against MeCP2 MBD and MBD1 MBD . Among CID3100583 docking residues, Asp121, Val122, Arg133, Thr158 was worthy of note for MeCP2 MBD that these were previously reported as harboring to the backbone or aromatic bases of DNA duplex (Ghosh et al., 2010;K. Liu et al., 2018;Nikitina et al., 2007) while Arg44, Ser45, Lys46 and Lys 65, which mediates the interaction between MBD1 MBD and negatively charged DNA backbone (Ohki et al., 2001;Zou et al., 2012) Fig. S4a-b). Besides, Ebis-Theophylline was also shown to dock the similar and/or adjacent residues on MBD1 MBD and MeCP2 MBD (See Supplementary Fig. S4c-d). For instance, Arg133 and Thr158 were appeared as two common residues on MeCP2 MBD which contributed to both CID3100583 and Ebis-Theophylline docking while Ser134 was shown to involve in van der Waals forces between only Ebis-Theophylline (See Supplementary Fig. S4c). Binding energies (kcal/mol) were calculated as -11.05 (K i = 7.91 nM), -9.52 (K i = 104. Because the mechanism of action remained unclear, it led us to survey the docking residues of the most effective inhibitors KCC-07 and CID3136570 patented by Nelson et al. (Nelson et al., 2010), in a similar manner (see Supplementary Fig. 5). As a reference, binding energies (kcal/mol) were -7.89 (KCC-07) and -7.61 (CID3136570) in our docking analysis, even lower than assumptions for novel inhibitor candidates in Table 1. Surprisingly, we detected Val177, Arg188, Arg209 in H-bonding while Arg166, Tyr178, Lys190, Phe208, Arg209 set hydrophobic interactions with CID3136570. Nonetheless, KCC-07 was shown to interact with the residues between Asp151 and Lys160 (see Supplementary Fig. S5a).

Complexes via MD Simulation
Because CID3100583 has emerged as the most promising druggable ligand in consequence of either virtual screening and docking analysis, we facilitated MD simulation in order to pursue dynamic behavior and to prove exibility/rigidity of protein-ligand complex binding along given time. Along with CID3100583-MBD2 MBD complex tted as superior (Table 1), MD simulations of other ligand-protein complexes with lower docking scores were also conducted as a comparison (Fig. 4a). Apparently, MBD2 MBD conformations with ligands CID3100583, CID343482, CID136748749, CID46959745, CID126001284, CID125615373, CID25826758, CID323153, CID24241 established mean RMSD values 0.89 Å, 0.87 Å, 0.94 Å, 1.04 Å, 1.14 Å, 1.14 Å, 1.27 Å, 0.97 Å, and 0.94 Å, respectively, against the average structure along the trajectory. Likewise, CID3100583-protein complex uctuated slightly through snapshots and followed by CID343482-and surprisingly by CID24241-protein complexes. Despite these complexes were uctuating proximately, CID3100583 could be distinguished owing to an earlier equilibrium state and a lesser chequered progress during simulation. The RMSD uctuation suggested that the MD trajectories except for CID12561537, have been overall stable for ligand-protein structures during simulation (Fig. 4a). It was noted that both protein backbone and CID3100583 ligand in the complex have been reached equilibrium in a shorter time (in approximately 200 frames) and remained stable except for negligible wobbles in a wider time span (Fig. 4b). Afterward, RMSF and B-factors were interpreted in order to detect atomic uctuations and x-ray scattering caused by thermal motion in speci c to per residue for C α atoms. Importantly, we apparently detected sharp declines in uctuation especially over three regions built by Val177, Arg188, Lys190, Ser204, Phe206, Asp207, Phe208, Arg209 and their adjacencies (Fig. 4c, orange line), demonstrating residues have rigidly docked to CID3100583 atoms and therefore been constrained less exible. Correspondingly, the regions in which B-factor was lower, account for regular secondary structures (Fig. 4c, blue line). For instance, Val177 and Arg188 accommodate in strands and Lys190 is included in helix while residues Ser204-Arg209 locates in a loop (see Supplementary Fig. S3). In other respects, RoG has been moderately altering over time in parallel with the RMSD chart. As plotted in Fig. 4d (blue line) during the initial 200 frames, here detected an unsteadiness in which also corresponded to the protein-ligand equilibrium phase (Fig. 4b). However beyond this point, RoG decreased upon probably binding with CID3100583, in concordance with the equilibrium of the system ( Fig. 4b and 4d). Fraction of native contacts (Qx) increases based upon more binding residues by determining transition in folding state, that is, the more the folding (lesser RoG), the higher the Q(x) is assumed (Best et al., 2013;Brylinski & Skolnick, 2008). Notably, Q(x) has been conducted verifying RoG data (Fig. 4d).
Despite the fact that Ebis-Theophylline, as the most probable repositioning drug (Table 1), showed the closest RMSD pro le among the annotated drugs, Ebis-Thophylline-MBD2 MBD system could catch the equilibrium a bit later and also uctuated much more during simulation trajectory compared to CID3100583-MBD2 MBD system (Fig. 5a). Average RMSD along MD trajectory was 0.89 Å, 0.90 Å, 0.98 Å, 0.94 Å, and 1.05 Å for CID3100583, Ebis-Theophylline, Golgicide A Sanguinarine, and GHMCytosine respectively. It's been noticed that Golgicide A-MBD2 MBD complex trajected more stably although it was the fth rank in the sense of docking (Table 1) and reached steady state at the latest, ashing as another alternative repurposing drug candidate (Fig. 5a). In more detail, both protein backbone and Ebis-Theophylline in the complex could get the equilibrium by the 450 frames but later on, they were seemed to considerably t in parallel until the end of time course (Fig. 5b). Similarly, atomic uctuations and B-factor was diminished in line with interacted residues such as Asp176, Val177, Arg188, Lys190, Asp207, Phe208, Arg209 (Fig. 5c). In case of RoG and Q(x), it could be estimated that there has been a transition in folding state at around 300 frames and continued with almost none of remarkable alteration in contrast to CID3100583 system, implying Ebis-Theophylline could detain receptor in more stable as soon as it has interacted with MBD2 MBD (Fig. 5d). In conclusion, dynamic behaviors and exibilities of crystal structure of both MBD2 MBD -CID3100583 and MBD2 MBD -Ebis-Theophylline complexes during MD trajectory have been consistent, proving strict interactions constructed in computational docking approach.

Calculation of MM-PB(GB)SA Binding Free Energy and E Dec Per Residue
After MD simulation, we employed MM-PBSA calculation and per-residue energy decomposition, aiming to elucidate putative binding free energy (ΔG Bind ) between MBD2 MBD and CID3100583 or Ebis-Theophylline, and the energetic contribution of each amino acid involved in ligand docking, respectively. As expected, compound-ligand complexes with lower docking scores and un ne uctuations established relatively lower binding free energy. As expected, 8,8-Ethylenebistheophylline had the strongest binding a nity (-15.03 kcal/mol) among annotated library and second rank within all ligands. However, CID3100583 was ranked as the third (-12.26 kcal/mol) in the list, and interestingly CID126001284 (-18.23 kcal/mol) and CID136748749 (-13.88 kcal/mol) demonstrated the strongest binding a nity (see Supplementary Table S2). In MBD2 MBD -CID3100583 trajectory, van der Waals interactions (-34.49 kcal/mol) and polar solvation energy (-26.10 kcal/mol) mainly contributed to ligand binding while nonpolar solvation energy (12.26 kcal/mol) conduced to unfavorable contributions (see Supplementary Table  S2). From this point of view, we concluded van der Waals forces and polar interactions should have an indispensable role in CID3100583 docking on MBD2 MBD .
We next examined the contribution per residue to binding free energy. Amino acids with energy decompose (E Dec ) contribution ranked in top ten, were Asp207, Arg209, Phe208, Val177, Phe206, Lys190, Asp176, Ser205, Tyr178, Thr210 in the protein-ligand system (Fig. 6a, top panel). It should be noticed that top energetic contributors to binding free energy have consistently been amino acids that were diversely interacted with CID3100583 within the binding pocket (Fig. 2c). As illustrated in Fig. 6a (bottom panels), the other residues except the top ten contributors had a less impact on binding free energy but certain neighboring residues around top ten could faintly contribute, as well (Fig. 6a, bottom panels). Herein, van der Waals forces have been detected as the main E Dec contribution however electrostatic energetics have constitutively been provided by Val177, Arg188, Ser204, Asp207 and Phe208 (Fig. 6a, bottom). In other respects, Ebis-Theophylline showed the top van der Waals (-37.55 kcal/mol) and polar solvation (-28.21 kcal/mol) contributions among all (see Supplementary Table S2). Accordingly, Arg188, Arg209, Phe208, Tyr178, Asp176, Val177, Val164, Asp207, Lys186, Lys190 were orderly top ten energetic contributors resulted from MD simulation (Fig. 6b, top panel), demonstrating a harmony with the amino acids achieved in docking analysis (Fig. 2d-e). As plotted in Fig. 6b (bottom) Asp176, Val177, Tyr178, Arg188, Phe208, and Arg209 have been essential for van der Waals contribution and also principle contributors to overall binding free energy. Though Val177, Tyr178, and Lys186 provided electrostatic forces (Fig. 6b), such forces have been accounted weakly in total energy decompose (see Supplementary Table S2). Thus, these results demonstrated docking might have been enabled by favor of such energetics contributed by especially amino acids enclosing CID3100583 or 8,8-Ethylenebistheophylline binding pockets.

Discussion
Since MBD2 is a miscellaneous participant in various biological phenomena and disorders, that links epigenetic methylation marks and gene expression regulation, it has arisen as a potent target for multipurpose. In the current study, we aimed to inspect novel drug candidates targeting MBD2 MBD through docking-based virtual screening and drug repurposing approaches. In this respect, we generated a comprehensive library including over 15,000 non-annotated compounds and biologically identi ed drugs, from ZINC15 database. Certain ligands have been listed in Supplementary Table S1 as hitter ligands by ltering according to resulted binding a nity and physicochemical properties such as complexity, molecular weight, tPSA, and also convenience to be purchasable or synthesizable for de facto attempts. Here, we have realized that numerous hit compounds (see Supplementary Table S1) and also unlisted ones consist of remarkable groups such as pyridine, quinoxaline, imidazole, isoquinoline, which are incorporated either in nucleosides and nucleotide analogs, or at least related structures to directly nucleotides or DNA-related constructs (Ali et al., 2008;García et al., 2008;Röthlisberger et al., 2017). For instance, quinoxaline and isoquinoline derivatives have been demonstrated to bind and act as antagonists of cyclic nucleotide-related enzymes such as cyclic nucleotide phosphodiesterase (Parra et al., 2001) and P2X7 nucleotide receptor (Humphreys et al., 1998;Watano et al., 2002), suggesting these groups might be attractive for DNA-or nucleotide-recognizing domains. Interestingly, NF449, an inhibitor for P2X1 and P2X2 receptors, was shown to inhibit MBD2 MBD interaction with methylated DNA, suggesting that binding with negative DNA backbone could be disrupted due to interaction between negatively charged sulfonic acids on NF449 and positively charged basic residues on the MBD2 surface (Wyhs et al., 2014). Likewise, we also found that ring structures in our ligands such as benzene, thiazole, quinone, pyridine, have already been built in NF449, Mitoxantrone, Idarubicin, Aurintricarboxylic Acid, KCC-07 (Wyhs et al., 2014;D. Zhu et al., 2018). Furthermore, some of screened chemicals were directly purine-or pyrimidine-containing molecules (ex. CID3100583, CID136707272, CID135623722, CID124903520, CID136732748, TB6:RiboPyrimidoP) or straightly nucleotide/nucleoside derivatives (ex. Regadenoson, Wyosine, 8-Methylamino-Guanosine, Thiosangivamycin, Adenosine 5'monophosphoramide, 5,6-Trimethyleneuridine, Fludarabine Base, 8-Aminoguanosine, Tubercidin, Hypoxanthosine, GHMcytosine, 6-Mercapto-7-Methylguanosine, 8-Chloroguanosine, Isovaleriansaures Coffein, 2-Methylformycin, 5'-N-Methylcarboxamidoadenosine, etc.), documented in Supplementary Table  S1. Additionally, these compounds highly contain methyl groups. Indeed, it has been quite interesting that a considerable number of particularly adenosine and guanosine derivatives have become prominent as a consequence of drug screening. It was reported that residues in MBDs of both MeCP2 and MBD2 exhibited binding a nity to mCAT, mCAC, mCAT, mCGG patterns, and surprisingly unmethylated TGdinucleotides (Fraga et al., 2003;K. Liu et al., 2018). Thereby, such interplay preferences of MBD might re ect why these compounds were recorded as hit molecules amongst the whole library. Then top ten of non-annotated and top ve annotated molecules were proceeded for further docking analysis. Principally, CID3100583 (ZINC3109386) and 8,8-Ethylenebistheophylline (ZINC8612354) urged us for the exhaustive investigation due to the highest binding energy and the lowest K i (Table 1). At this point, chemical contexts of both substances have been intriguing since CID3100583 contains two pyrimidine-based rings, and Ebis-Theophylline is composed of two interconnected druggable Theophylline (dimethylxanthine) molecules, a methylated xanthine analog which antagonizes human A1, A2a, A2b Adenosine Receptors as in the treatment of airways diseases and vasodilation (Cheng et al., 2017;Fieger & Wong, 2010;Landells et al., 2000). Together, similarities between the actual recognition target of MBD2 MBD (various nucleotide patterns above) and top hitters in the context of chemical and structural features have been speculated as possible reasons underlying such complementarity and ne docking.
Actually, all ligands have been demonstrated interacting with similar residues (Fig. 1). Particularly, hydrogen bonds, van der Waals forces, pi-Alkyl, and pi-Anion interactions between Val177, Arg188, Ser189, Lys190, Phe208, and Arg209 were the most prominent common residues, signifying both ligands have been stabilized by cementing highly reactive surroundings. It was previously reviewed that arginine, lysine, serine, threonine was orderly top interacting amino acids in protein-DNA complexes and arginine was featured prominently through hydrogen bonding/cation-π interactions with pyrimidine-guanine dinucleotides in DNA backbone (Luscombe et al., 2001;Zou et al., 2012). As shown, such kinds of amino acids were also favorably frequent in our analysis. In this respect, Liu et al. also reported MBD2 bound to DNA via constitution of H-bonding between Arg166, Asp176, Tyr178, Arg188 within MBD2 MBD and methylated-Cytosine (mC) along with adjacent Thymine or Guanine bases in the context of mCGG, mCAG, mCAT, mCC, mCT di-/tri-nucleotide patterns (K. Liu et al., 2018). Along with pi-Anion interactions between referred amino acids and aromatic bases above, alternate residues such as Lys167, Ser168, Ser171, Lys186, Ser189, Lys190, Arg209, interacted with the sugar-phosphate backbone of DNA.  in the vicinity to DNA duplex (Buchmuller et al., 2020). In other studies, Lys174 and Tyr178 have been reported as critical amino acids to selectively recognize and bind methylated CpG probes (Cramer et al., 2014;Fraga et al., 2003). Here, we demonstrated our compounds rmly settled in this functional mCpG-binding/recognition sphere. Supplementary Fig. S4, CID3100583 and Ebis-Theophylline also tted into MeCP2 MBD and MBD1 MBD , supporting these inhibitor candidates have potentially mimicked related nucleotide patterns and favorably invaded conserved methylated DNA recognition residues on MBDs. Importantly, Arg188 (MeCP2; R133, MBD1; R44), Ser189 (MeCP2; S134, MBD1; S45) and Phe208 (MeCP2; F157, MBD1; F64) were the most outstanding amino acids between family members , participated in H-bonding or van der Waals interactions between CID3100583, Ebis-Theophylline. However, we detected slight declines in calculated binding energies of MeCP2-/MBD1-ligand complexes, suggesting a nity and inhibition potential of both compounds have remained more speci c to MBD2 MBD in silico. These data collectively indicate that docking territories of the compounds were not indiscriminate due to chemical and structural determinants in inhibitor candidates shared between DNA. As another comparison, CID3136570 and CID3100583/Ebis-Theophylline could target the similar sub eld within MBD2 MBD however KCC-07binding residues were more comparable with other compounds such as CID46959745, CID126001284, CID125615373 (see Supplementary Fig. S5). These overlapping regions between previously patented inhibitors and CID3100583/Ebis-Theophylline compatibly appreciate the value and expected potential of our outcome in a prospective real assay.

As shown in
Moreover, we have sought out presumptive druggable pockets on the surface or interior of MBD2 MBD in order to conclude whether ligand-docking sites represented any signi cance in this sense (see Supplementary Information). In silico prediction yielded in a convergence between putative druggable pockets and ligand-anchored pockets (see Supplementary Fig. S6), proposing these ligands have also opted for ideal physicochemical districts on the protein as possible.
Ohki et al. de ned a small portion between positions Arg22-Arg30 in human MBD1 (conservatively corresponds to Arg166-Lys174 in human MBD2) as a exible loop (L1), demonstrating major conformational rearrangements upon DNA binding. Another loop (L2) between residues 44-48 (Lys186-190 in MBD2) was inserted into major groove although the rest of DNA-bound MBD had a stationary architecture in MBD1 and MeCP2 according to unbound state, clearly indicating an induced t model in vitro (Ohki et al., 2001). Similar loops with conserved residues and a congener DNA-bound orientation were well-structured for chicken MBD2 MBD by Scarsdale and colleagues, as well (Scarsdale et al., 2011).
Since we utilized an IFD to further insight to possible structural shifts in especially active binding site in MBD2 MBD . Consistently, we depicted moderate structural changes in side-chains of Arg166, Ser171, Lys174, Arg188, Ser189, Lys190, all compatible with L1 and L2 regions, and also in their adjacent residues such as Asp176, Val177, Tyr178 upon CID3100583 and Ebis-Theophylline docking ( Fig. 3a and  3b, also see Supplementary Movies S1-S2). Hereby, we putatively deduce that CID3100583 and Ebis-Theophylline are not only in contact with crucial amino acids for mCpG recognition, but also may trigger conformational changes in the active site in order to t better likewise mimicking DNA attachment in fact. Brie y, these data satisfactorily highlight the fact that ligands achieved in our study, especially CID3100583 and 8,8-Ethylenebistheophylline, could actually interfere methylated DNA-binding function of MBD2 by selectively invading of indispensable residues (see Supplementary Fig. S7).
Next, we simulated each ligand-protein complex to test its structural dynamics and stability in a time course. As plotted in Fig. 4a and 5a, CID3100583, CID343482, and Ebis-Theophylline established more stable dimensionality effects or conformational distributions over time, corresponding docking a nities (Table 1). Herein, both ligands (CID3100583 and Ebis-Theophylline) along with protein backbone, showed amenable uctuations and equilibration timing concurrently with MBD2 MBD receptor ( Fig. 4b and 5b).
Surprisingly, even though amino acids between Ser204 and Arg209 have been involved within an irregular loop region (see Supplementary Fig. S3), they possessed lower B-factor values ( Fig. 4c and 5c). The magnitude of that reduction may presumably be due to immobilization of the loop region by extensive interactions with either CID3100583 or Ebis-Theophylline over there. As a matter of fact that interactions with CID3100583 were potently concentrated on residues between Ser204 and Arg209, hence ensuring a further decline even rather than strands or helices and corresponding zone in MBD2 MBD -Ebis-Theophylline complex (Fig. 4c and 5c). Of note, RoG has moderately diminished at around frames 250 and 400 for CID3100583 and Ebis-Theophylline ( Fig. 4d and 5d), respectively, postulating a compaction in both protein-ligand systems (Best et al., 2013;Brylinski & Skolnick, 2008). Within this context, spectroscopic analysis revealed that upon methylated CpG binding, dynamic mobility and conformation of the complex have been shifted under the favor of DNA bending by MBD2 MBD (Pan et al., 2017).
Similarly, Liu et al. reported compaction in methylated DNA-bound MeCP2 MBD detected by atomic force microscopy in parallel with a decrease in RoG correlation (M. Liu et al., 2020). Furtherly, it was experimentally gured out DNA-contacting regions including Arg46 (Arg188 in human), Ser47 (Ser189 in human) Arg67 (Arg209 in human) in chicken MBD2 MBD was well-structured (Scarsdale et al., 2011), which is consistent with the non-uctuated residues detected in our MD simulation ( Fig. 4c and 5c). This may point to the hypothesis that reduction in RoG and a parallel increment in Q(x) could be resulted from further compaction and folding due to a con guring action of docked CID3100583 or Ebis-Theophylline by mimicking an ordinary MBD-DNA binding. In correlation, acquired energetics data also corroborated our docking results and conclusions so far. Both CID3100583 and Ebis-Theophylline have been ranked amongst the top ligands with the highest binding free energy (see Supplementary Table S2). Energy decomposition for CID3100583 showed Asp207, Arg209, Phe208, and Phe206 appeared in the top ve in terms of contribution to overall ΔG Bind (Fig. 6a), supporting rigidity in the loop region discussed above.
Besides, per residue energy decomposition (Fig. 6) and interactions in docking analysis (Fig. 2) established a ne correspondence for both ligands. For instance, Phe208, one of the top van der Waals energy-decomposing residues for CID3100583 was also constituted in van der Waals interaction as a result of docking analysis while Val177 incorporated with Ebis-Theophylline by electrostatic forces via both approaches (Fig. 6b).
Consequently, the presented computational study has established prosperous and crosschecking evidence for certain ligands in a comprehensive manner. CID3100583 and 8,8-Ethylenebistheophylline have successfully been invented as novel and most likely MBD2 inhibitors in the light of effective binding a nity, and negligible off-target probability predicted by ADMET analysis (Xiong et al., 2021) (see Supplementary Table S3). Though in silico approach was utilized within the scope of this paper, we have provided a consolidated basis for the next experimental studies testing their actual druggability. In conclusion, we rationally propose that especially CID3100583 and 8,8-Ethylenebistheophylline are highly versatile with the capability to be notably used as anti-cancer drugs and agents in iPSC reprogramming cocktails by impeding MBD2 function in related pathways. Total 1293 tranches in ZINC15 database (Sterling & Irwin, 2015) were ltered by molecule charge (between 0 and +2), molecular weight (MW, between 200 and 400 Da), and logP value (between -1 and +4.5) and thereby over 15,000 ligands could be extracted in .sdf format. Afterwards, ligand structure les were converted into .pdbqt le format via Open Babel toolbox (O'Boyle et al., 2011) for the further docking analysis.

High-throughput Virtual Screening
Structure-based virtual screening was accomplished by Vina wizard in PyRx program (Dallakyan & Olson, 2015) to discover possible binders for the related domain of MBD2 after ligand libraries including 15,893 compounds and MBD2 MBD (7AO8; C149-C212) was pre-processed. Grid box was adjusted with centering coordinates 121.02, 178.69, 185.85 and sizes with 38.57 Å x 35.46 Å x 31.60 Å for X-, Y-, Zdimensions, respectively, in order to involve the entire structure. Before starting screening, all compounds in the library were energy minimized. Results were sorted in descending according to a nity and the top 10 compounds with the best scores were evaluated for in silico docking approach.
In Silico Molecular Docking Molecular docking analysis was performed by AutoDock 4.2 (version 1.5.6), a widely used automated program for the prediction of molecular interactions between protein and ligand structure, based on exible-ligand/rigid-receptor model (Morris et al., 2009). Prior to docking run, all protein models (PDB: 7AO8, Chain C for MBD2 MBD ; 6OGJ for MeCP2 MBD ; 6D1T for MBD1 MBD ) and each ligand were structured and prepared using the mentioned software. Accordingly, removal of any other existed ligands/ions/heteroatoms and water molecules to obtain a crude protein structure, addition of polar hydrogens and missing atoms, calculation of Kollman charges, assignment of search parameter (Genetic Algorithm) and output format (Lamarckian GA) were all implemented as previously reported (Çalışkaner, 2021). As distinct from, we have played 150 runs (conformations) with a population size of 300 as rigid docking for each ligand-receptor complex. Besides, grid box was adjusted in the same coordinates and sizes as used in virtual screening with 0.375 Å spacing for blind docking since any possible binding residues in Methyl-DNA Binding Domain (MBD) of human MBD2 (MBD2 MBD ) were not missed out. The rst ranked conformation with the best binding energy and inhibition constant (Pantsar & Poso, 2018) was proceeded for further examination.
Induced-Fit-Docking (IFD) was performed by using AutoDockFR, a new docking engine based on the AutoDock4 scoring function (Ravindranath et al., 2015). Protein and each ligand were prepared according to the program instructions and stored with .pdbqt extension. All ligand interacting amino acids from prior AutoDock docking (rigid receptor; exible ligand) were selected as active site residues. After a nity maps were generated for each atom type, IFD ( exible-ligand; exible-receptor) was run by scripts presented by the developer.

Mapping Molecular Interactions and Distances
Intermolecular interactions such as hydrogen bonds (H-bonds), hydrophobic interactions, salt bridges, halogen bonds, etc. in the best docked ligand-protein pose with calculated distances, were extracted and visually rendered using PLIP-Protein Ligand Interaction Pro ler tool (Salentin et al., 2015)

Molecular Dynamics (MD) Simulation
Dynamic behaviors of the docked protein-ligand complexes were computed through VMD ver1.9.4a51 software (Humphrey et al., 1996), a parallel molecular dynamics code NAMD (Phillips et al., 2020), and CHARMM22 force eld (MacKerell et al., 1998). The parameters and protein topology les were generated automatic PSF builder inside VMD while ligand structures were utilized using CHARMM-GUI Ligand Reader and Modeler tool (S. Kim et al., 2017). The systems were solvated with an octahedron box of TIP3P water environment (Price & Brooks, 2004) extended 10 Å in each direction and Na + /Clcounterions were added to neutralize the system. Upon energy minimization in 3000 steps, the whole system was heated to 300K, in 30 ps, at 1 atm pressure under periodic boundary conditions to avoid edge effects.
Finally, a production run of MD simulation was carried out for a 20-ns duration. Flexibility and dynamics of each trajectory were evaluated by the query of Root Mean Square Deviation (RMSD), Root Mean Square Fluctuation (RMSF), Radius of Gyration (RoG), B-factor (Debye-Waller factor), fraction of native contacts (Qx), and estimated MM-PB(GB)SA binding free energy (deltaPB/ deltaGB). Resulted values were plotted and rendered by GraphPad Prism 6 graphic and statistics software.

Calculation of MM-PB(GB)SA Binding Free Energy and Energy Decomposition
The Poisson-Boltzmann or generalized Born and surface area continuum solvation (MM/PBSA and MM/GBSA) approaches are utilized in certain drug design strategies to estimate the free energy binding of small ligands to receptors (Genheden & Ryde, 2015;Yang et al., 2020). First, a total of 200 snapshots from simulated MD trajectory for 20 ns were extracted to explore MM-PB(GB)SA free energy and energy decomposition (E DEC ) per residue. Then, van der Waals force (ΔE VDW ), electrostatic interaction (ΔE ELE ), total (polar+nonpolar) solvation free energy (ΔE SOL ) contributions, and total estimated binding free energy (ΔG Bind ) were employed as post-processing approach, using two different web servers AMMIS (Wu et al., 2018) and farPPI (Wang et al., 2019), by following recommended tool instructions. -Certain residues are colored in association with druggable pockets. -Certain residues are colored in association with druggable pockets. Figure 1 The number of interactions contributed by common amino acids which have distributed within top ten non-annotated and ve annotated compounds from virtual screening.    Ethylenebistheophylline. Tiny squares on RMSF and B-factor curves, corresponds to residues between Asp176-Tyr178, Arg188, residues between Asp207-Arg209, respectively.

Figure 6
Detailed binding free energetics of both MD simulation output of MBD2 MBD -Ligand systems. Graphics of top ten residues (top panels) with the highest energy decomposition per residue for MBD2 MBD -CID3100583 (a) and MBD2 MBD -Ebis-Theophylline (b) complexes. Bottom panels (a, b) infer different energetic contributions of all residues within each system.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download. RevisedSupplementaryInformation.pdf