Structure-based Discovery of Potential Inhibitors Targeting Sodium-bile Acid Co-transporter of Carcinogenic Liver Fluke Clonorchis sinensis

Background: Clonorchis sinensis requires bile acid transporters because this fluke inhabits bile juice-filled biliary ducts, which provide an extreme environment. C. sinensis sodium-bile acid co-transporter (CsSBAT) is indispensable for survival in the final host, as it circulates taurocholate and prevents bile toxicity in the fluke; hence, it is recognized as a useful drug target. Results: In the present study, using a structure-based drug discovery approach, we present inhibitor candidates targeting the bile acid-binding pocket of CsSBAT. CsSBAT models were built by comparative tertiary structure modeling based on bile acid transporter templates (PDB IDs: 3zuy and 4n7x) and were used in AutoDock Vina for competitive docking simulation. First, potential compounds were identified from PubChem (holding more than 100,000 compounds) by applying three criteria: i) interacting more favorably with CsSBAT than with a human homolog, ii) interacting closely with both the inward- and outward-facing conformational states, and iii) binding to CsSBAT more favorably than natural bile acids. Second, two compounds were identified that follow Lipinski's rule of five. Third, another two compounds with molecular weight higher than 500 Da were presumed to efficiently block the transporter, as identified via a feasible rational screening strategy. These four inhibitor candidates exhibited the least toxicity, which may enhance their drug-likeness. Conclusion: Four compounds are proposed as potential inhibitors of CsSBAT, and further studies are warranted for drug development against clonorchiasis.


Background
Clonorchis sinensis is a trematode parasite commonly observed in humans, and is transmitted by eating raw or undercooked freshwater fish contaminated with its metacercariae [1]. The intensity and duration of the infection determine the disease complications caused by this trematode. The World Health Organization has classified C. sinensis as a Group 1 biological carcinogen inducing cholangiocarcinoma in humans [2].
Praziquantel has been strongly recommended for treating trematode infections in humans, including clonorchiasis [3]. Due to the extensive use of praziquantel, certain trematodes in tropical countries have developed low sensitivity to this drug [4,5]. Hence, new chemotherapeutic agents should be developed to circumvent the low sensitivity or drug resistance in these trematodes. To date, several drug candidates for trematodiasis have been tested on animals [6-10]. Among these, numerous drugs are under trial in clonorchiasis patients [11].
Structure-based drug discovery (SBDD) is a powerful approach for compound identification and has recently gained immense attention in drug development [12]. Several studies have reported the effectiveness of computational approaches in discovering novel antiparasitic druggable compounds [13-15]. For C. sinensis, secretory phospholipase A2 enzyme [16] and 20.6-kDa tegumental protein [17] were computationally evaluated as druggable targets; however, more compounds designed by SBDD are necessary to expand the drug candidate repertoire because they can be selectively tested according to the worm burden or developmental stage of C. sinensis.
Notably, bile acids trigger physiological stimuli; however, these metabolites can have detrimental effects on C. sinensis survival, as evidence suggests that accumulated bile acids can be toxic to the worm's tissues and cells [18,19]. Recent studies reported that bile components can be toxic or detrimental to C. sinensis survival [10,20,21]. Under in vitro conditions, higher concentrations (> 0.005%) of bile and lithocholic acid significantly decreased worm survival [10,20,21]. Therefore, a defense system against the accumulation and toxicity of bile acids could be targeted to decrease C. sinensis survival.
Apical sodium-dependent bile acid transporter (ASBT) and Na+-taurocholate co-transporting polypeptide (NTCP), which are homologs of C. sinensis sodium-bile acid co-transporter (CsSBAT), contribute to the enterohepatic circulation of bile salts in humans [22]. Since ASBT is a promising druggable target owing to its role in bile circulation, several inhibitors have been developed and evaluated [23-25]. Therefore, CsSBAT should be considered a druggable target whose taurocholate-binding pocket can be blocked to impede its transporter function.
In the present study, CsSBAT structures were reliably prepared using comparative tertiary structure modeling and refinement. For compound screening, Lipinski's rule of five and a rational virtual screening strategy were applied. Here, we identified putative inhibitory compounds that competitively target the taurocholate-binding pocket of CsSBAT.

Methods
Tertiary structure modeling and refinement
The full-length cDNA sequence of CsSBAT (Acc. No. KX756671) was retrieved from the GenBank database [26]. To generate the three-dimensional (3D) structure of CsSBAT, we compared the 3D structures built using different 3D modeling tools, namely Swiss-Model [27], IntFOLD [28], Phyre2 [29], RaptorX [30], HHpred [31], and I-TASSER [32]. The 3D models were then refined in two steps. First, low free-energy conformations of the 3D structure were refined by full-atomic simulations using either ModRefiner [33] or FG-MD [34]. Thereafter, the backbone and side chains of the structure were refined using GalaxyRefine [35] with the "both mild and aggressive relaxation" method, which is based on repeated perturbation and overall conformational relaxation with short molecular dynamics simulations. Outward-facing (OF) and inward-facing (IF) conformations of CsSBAT were modeled using the templates Yersinia frederiksenii ASBT (YfASBT; PDB ID: 4n7x_A) [36] and Neisseria meningitidis ASBT (NmASBT; PDB ID: 3zuy_A) [37], respectively.

Quality validation and binding pockets of 3D models
Potential errors in the 3D models were evaluated using PROCHECK [38], ProSA [39], and ERRAT [40]. The residue-by-residue stereochemical quality of the 3D models was verified using the Ramachandran plot [41] of PROCHECK. The overall quality score was analyzed by calculating the atomic coordinates of the model using ProSA and comparing the resulting Z-score with those of experimentally determined structures deposited in PDB [42]. Statistics of nonbonded atom-atom interactions were validated in comparison to a database of reliable high-resolution crystallographic structures using ERRAT [40]. All structures and protein-compound interactions were visualized using UCSF Chimera v1.14 [43]. Disordered regions were predicted using ESpritz [44]. Substrate-binding sites in CsSBAT were predicted using COACH [45], referring to analogs with similar binding sites.

Toxicity assessment
Toxicity risk and oral toxicity (LD50) were predicted using ProTox [55]. A higher LD50 value indicates a less toxic compound. A toxicity class of 4-6 was taken to indicate that the compound is safe.
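As a rough illustration of how a predicted LD50 relates to a toxicity class, the following Python sketch maps an oral LD50 (mg/kg) onto six classes using GHS-style boundaries (5, 50, 300, 2,000, and 5,000 mg/kg). The boundaries, function names, and example values are assumptions made for illustration; they are not taken from the ProTox output of this study.

```python
# Illustrative sketch: map a predicted oral LD50 (mg/kg) to a toxicity class.
# The class boundaries follow a GHS-style six-class scheme and are an assumption here.
CLASS_UPPER_BOUNDS = [(5.0, 1), (50.0, 2), (300.0, 3), (2000.0, 4), (5000.0, 5)]


def toxicity_class(ld50_mg_per_kg):
    """Return a class from 1 (most toxic) to 6 (least toxic)."""
    if ld50_mg_per_kg <= 0:
        raise ValueError("LD50 must be positive")
    for upper_bound, tox_class in CLASS_UPPER_BOUNDS:
        if ld50_mg_per_kg <= upper_bound:
            return tox_class
    return 6  # anything above the last boundary


def is_considered_safe(ld50_mg_per_kg):
    """Apply the class 4-6 criterion stated in the text."""
    return toxicity_class(ld50_mg_per_kg) >= 4


if __name__ == "__main__":
    # Hypothetical LD50 values, not results from the study
    for ld50 in (40, 500, 5000, 8000):
        print(ld50, toxicity_class(ld50), is_considered_safe(ld50))
```

Under these assumed boundaries, an LD50 between 500 and 5,000 mg/kg falls into class 4 or 5.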

Results And Discussion
The best 3D model of CsSBAT was built from comparative homology modeling
Both N- and C-terminal regions of CsSBAT were predicted to be hypothetical as well as disordered. In particular, disordered regions can result in long simulation time and may lead to errors in the structural clustering process [56]. Therefore, these regions were excluded from building 3D models (Additional file 1: Fig. S1). The functional region (residues 185-492) of CsSBAT matched well with experimentally characterized ASBTs such as IF-NmASBT (PDB ID: 3zuy_A) [37] and OF-YfASBT (PDB ID: 4n7x_A) [36].
To obtain reliable homologous models of parasite proteins, a combined approach of 3D modeling methods and refinement has been employed [17-19,57-59]. 3D models of IF-CsSBAT were predicted using homology modeling programs such as Swiss-Model [27], IntFOLD [28], Phyre2 [29], RaptorX [30], and HHpred [31], and a threading-based modeling program, I-TASSER [32], and were then evaluated (Table 1). Swiss-Model, IntFOLD, and HHpred yielded values greater than 91.0% in the most favored region of the Ramachandran plot [41]. IntFOLD was excluded because of its poor ERRAT value, whereas Swiss-Model, HHpred, and I-TASSER were carried forward to refinement; I-TASSER was retained because it is efficient at building the structure of unaligned regions by ab initio modeling [60]. Homology modeling of membrane proteins is challenging when the target protein shares low sequence identity (approximately 20%) with a template [61]. This issue has been circumvented by refining initial models of poor quality [17,18]. Therefore, the initial models of IF-CsSBAT were refined with ModRefiner, FG-MD, and GalaxyRefine (Table 2). After comprehensive evaluation, we applied Swiss-Model for reliable 3D modeling, and thereafter, FG-MD and GalaxyRefine were used for effective refinement. In fact, Swiss-Model has been a powerful tool for transporter modeling [62,63]. Refinement enhanced the structural quality of the final model compared to that of the initial model, particularly in terms of ERRAT values, which increased from 46.0% to 97.7% (Tables 1 and 2, Additional files 1 and 2: Figs. S1 and S2); however, the I-TASSER and HHpred models could not overcome poor values in either the most favored regions of the Ramachandran plot or the ERRAT plot. Moreover, 3D models of OF-CsSBAT, OF-HsASBT and IF-HsASBT were prepared as described above.
Bile acid-binding cavity and sodium-binding sites
For the translocation of bile acids across the cell membrane, alternating conformational changes between the IF and OF conformations of secondary active transporters have been proposed [36,64]. For a given transporter, usually only one of the two conformational states has been observed because it is difficult to crystallize the structure in the other state. Structural information from one state has therefore been used to predict the other conformation of transporters [65].
Here, we predicted the OF-CsSBAT and IF-CsSBAT models based on OF-YfASBT [36] and IF-NmASBT [37], respectively. In CsSBAT, the bile acid-binding pocket was presumed to be formed by 9 residues in an extracellular cavity with a volume of 908 Å³ in OF-CsSBAT (Fig. 2A) and by 11 residues in an intracellular cavity with a volume of 986 Å³ in IF-CsSBAT (Fig. 2B). Among the pocket-forming residues, five residues (Phe196, Phe222, Ala288, Ala291, and Met295) participated in both conformations.
Recently, a putative third Na+-binding site was proposed, albeit rather speculatively and without experimental evidence [64]. A third Na+-binding site has been reported in similar transporters such as glutamate [67] and leucine [68] transporters. In CsSBAT, residues Ile280, Gly281, Ser283, and Gln445 were predicted to form the binding site of the third Na+ ion, and these superposed with the corresponding residues of NmASBT (Fig. 3D). Residue Gln445 could be a key residue carrying the Na+ ion from the Na2 to the Na3 site (Fig. 3). Mutation of Gln258 in YfASBT (corresponding to Gln445 in CsSBAT) was reported to significantly reduce the Na+-binding capacity of the Na2 and Na3 sites [36]. Therefore, Glu441 and Gln445 in CsSBAT might act as molecular arms that transport the Na+ ion from one site to the next.
For accurate molecular docking of CsSBAT against compounds in the library, the grid center and size were precisely specified for the extracellular and intracellular bile acid-binding pockets of OF-CsSBAT (Fig. 2A) and IF-CsSBAT (Fig. 2B), respectively, rather than for the sodium-binding sites (Fig. 3B-D), as described in the "Putative inhibitors screening" section of Materials and Methods.
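One common way to specify such a grid is to enclose the pocket atoms in an axis-aligned box with some padding. The Python sketch below illustrates that idea only; the coordinates, the 4 Å padding, and the function name are hypothetical, and the actual grid parameters of this study are those given in the Methods.

```python
# Illustrative sketch: derive a docking box from pocket atom coordinates.
# The coordinates and the 4 Å padding are hypothetical, not values from the study.

def docking_box(coords, padding=4.0):
    """Return (center, size) of an axis-aligned box enclosing coords plus padding."""
    if not coords:
        raise ValueError("at least one coordinate is required")
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2.0 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2.0 * padding for lo, hi in zip(mins, maxs))
    return center, size


if __name__ == "__main__":
    pocket_atoms = [(0.0, 0.0, 0.0), (10.0, 4.0, 2.0), (5.0, 1.0, 1.0)]
    center, size = docking_box(pocket_atoms)
    # AutoDock Vina configuration files take center_x/y/z and size_x/y/z entries
    for axis, c, s in zip("xyz", center, size):
        print(f"center_{axis} = {c:.3f}")
        print(f"size_{axis} = {s:.3f}")
```

Restricting the box to the bile acid-binding pocket keeps the search away from the sodium-binding sites, which is the design choice described above.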

Putative inhibitors targeting the bile acid-binding pocket
Structure-based virtual screening was used to select putative inhibitors of CsSBAT that satisfied the following criteria (Fig. 1). i) A compound should interact more favorably with CsSBAT than with HsASBT to ensure accurate targeting. ii) The OF-ASBT conformation should also be considered as a target, although IF-ASBT bound to taurocholate was used as the template for virtual docking [25], because ASBTs transfer bile acid inward via a conformational change from the OF- to the IF-conformation [69]. iii) A compound to be identified as a competitive inhibitor of taurocholate should show higher affinity than natural bile acids, whose binding energies range from −6.2 to −9.0 kcal/mol [25]. Theoretically, the binding energies of several bile acids against the IF-CsSBAT and OF-CsSBAT conformations ranged from −6.1 to −8.7 kcal/mol (Table 3) [19]; however, those of HsASBT (CsSBAT homolog) were −9.0 kcal/mol and −9.2 kcal/mol against natural bile acids and poly(acrylic acid)-tetraDOCA conjugate (PATD; lead compound blocking HsASBT), respectively [25]. Thus, a cut-off value was set at −9.2 kcal/mol since the present study aimed to explore the most probable inhibitor candidates binding to CsSBAT.

[46] were screened using MTiOpenScreen [47]. Of the top 1,000 scoring compounds under docking simulation against the OF- and IF-conformations of CsSBAT and HsASBT, 19 compounds that could interact with only OF-CsSBAT or IF-CsSBAT were selected (Fig. 4 and Additional file 3: Table S1). Of these, two compounds met our strict criteria. Compound 49734421 formed a hydrogen bond with Ala291 of IF-CsSBAT and Asn446 of OF-CsSBAT. Compound 124948115 formed two hydrogen bonds with OF-CsSBAT but not with IF-CsSBAT (Table 4 and Fig. 5). Most of the interactions of these two compounds with CsSBAT residues were hydrophobic, implying that these interactions might play a crucial role in compound-protein interactivity.
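The three screening criteria can be sketched as a simple filter over docking scores. The Python snippet below is one possible reading, not the authors' pipeline: the class and function names are hypothetical, the example scores are invented, and the way the criteria are combined (selectivity in each state, and both states reaching the −9.2 kcal/mol cut-off) is an assumption.

```python
# Illustrative sketch of the three screening criteria as a score filter.
# All names and example scores are hypothetical.
from dataclasses import dataclass

CUTOFF_KCAL_PER_MOL = -9.2  # cut-off value stated in the text


@dataclass
class DockingScores:
    """Binding energies (kcal/mol) of one compound; more negative is more favorable."""
    of_cssbat: float
    if_cssbat: float
    of_hsasbt: float
    if_hsasbt: float


def passes_screen(s, cutoff=CUTOFF_KCAL_PER_MOL):
    # i) more favorable with CsSBAT than with the human homolog in each state
    selective = s.of_cssbat < s.of_hsasbt and s.if_cssbat < s.if_hsasbt
    # ii) both the OF and IF conformations are considered, and
    # iii) each must be at least as favorable as the cut-off
    strong_in_both = s.of_cssbat <= cutoff and s.if_cssbat <= cutoff
    return selective and strong_in_both


if __name__ == "__main__":
    candidate = DockingScores(of_cssbat=-9.8, if_cssbat=-9.5,
                              of_hsasbt=-8.1, if_hsasbt=-8.4)
    print(passes_screen(candidate))
```

Because more negative energies mean tighter predicted binding, "higher affinity" translates into a less-than comparison throughout.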
It has been reported that aromatic moieties with high hydrophobicity can enable beneficial interactions with nonpolar residues in the binding pocket [70]. According to Lipinski's rule of five, absorption or drug permeability is presumed to be more likely when a compound has no more than 5 hydrogen bond donors, no more than 10 hydrogen bond acceptors, a molecular weight of no more than 500 Da, and a calculated LogP of no more than 5. Recently, it was suggested that antiparasitic drugs should be exempted from this rule because several drug leads for infectious diseases do not follow Lipinski's rule of five [46,71]. Less stringent criteria may allow more lead compounds to be identified for further assays. This suggestion motivated us to find a more effective strategy for antiparasitic drugs.
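The rule-of-five check can be written as a short function. The Python sketch below uses the thresholds as commonly stated (at most 5 hydrogen bond donors, at most 10 acceptors, molecular weight at most 500 Da, LogP at most 5); the function name and the example descriptor values are hypothetical, and in practice the descriptors would come from a cheminformatics toolkit.

```python
# Illustrative sketch: list Lipinski rule-of-five violations for one compound.
# Descriptor values are supplied by the caller; the examples below are invented.
from typing import List


def lipinski_violations(h_donors: int, h_acceptors: int,
                        mol_weight: float, logp: float) -> List[str]:
    """Return the names of the rule-of-five criteria that the compound violates."""
    violations = []
    if h_donors > 5:
        violations.append("hydrogen bond donors")
    if h_acceptors > 10:
        violations.append("hydrogen bond acceptors")
    if mol_weight > 500.0:
        violations.append("molecular weight")
    if logp > 5.0:
        violations.append("LogP")
    return violations


if __name__ == "__main__":
    print(lipinski_violations(2, 6, 420.5, 3.1))   # small, drug-like example
    print(lipinski_violations(3, 9, 650.0, 4.2))   # large compound example
```

Returning the list of violated criteria, rather than a single pass/fail flag, makes it easy to relax one criterion, such as molecular weight, while keeping the others.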
Large compounds satisfying high affinities
PATD was recently synthesized and evaluated as a potent inhibitor against ASBT [25]. The molecular weight of the PATD molecule is larger than 500 Da because it has several polyacrylic acids and tetraDOCAs. Surprisingly, PATD is a hydrophobic substrate that violates the molecular weight limit of Lipinski's rule of five. Nonetheless, it successfully inhibits ASBT by filling up the bile acid-binding cavity [46]. This finding motivated us to screen compounds with molecular weight higher than 500 Da, which are assumed to dock tightly to CsSBAT.
Compounds of high molecular weight (500-1,200 Da) were retrieved from the PubChem compound library [49] and screened using AutoDock Vina v1.1.2 [50]. Of the 1,255 compounds, 49 compounds satisfied the three given criteria. By strictly applying the third criterion (higher affinity than natural bile acids), we selected 25 compounds with high affinity for CsSBAT but low affinity for HsASBT (Additional file 4: Table S2). Five compounds presented toxicity classes of 4-6 with LD50 ranging from 500 to 5,000 mg/kg (Table 5). Eventually, two compounds, 45375808 (Fig. 6A and B) and 9806452 (Fig. 6C and D), were selected as possible candidate inhibitors of CsSBAT because these compounds could form more than two hydrogen bonds each with OF-CsSBAT and IF-CsSBAT. Notably, in docking simulations of compound-protein interactions, residues Glu229 and Gly287 participated in hydrogen bonding in the taurocholate-IF-CsSBAT complex, whereas residues Gly287, Gln345, and Gln348 participated in hydrogen bonding in the taurocholate-OF-CsSBAT complex (Additional file 2: Fig. S2).
Gly287 was involved in both the taurocholate-IF-CsSBAT and taurocholate-OF-CsSBAT complexes. In the majority of the compound-OF-CsSBAT complexes, this conserved residue Gly287 was involved in hydrogen bond interaction (Fig. 6A, C, E, and G), and Gln345 was involved in compound 92727-OF-CsSBAT complex formation (Fig. 6I). Among the compound-IF-CsSBAT complexes, only compound 441243 formed a hydrogen bond with Glu229 (Fig. 6F). In contrast to the taurocholate-CsSBAT complexes, Ala288 acted as a key residue for either hydrogen bonds or hydrophobic interactions in three compound-CsSBAT complexes (Fig. 6A-F).
Compounds 441243 and 3693566 presented 5 and 6 hydrogen bonds, respectively; however, these compounds were excluded owing to fewer hydrogen bonds in compound-IF-CsSBAT interactions, which could result in off-target binding with adverse drug reactions [72].
Compound 45375808, known as sofosbuvir, was proposed to inhibit nonstructural protein 5B polymerase in hepatitis C virus (HCV) [73]. Analogs of this compound exhibited an inhibitory effect on HCV [74]. Compound 9806452 was reported as an inhibitor of matrix metalloproteinases [75] such as gelatinase A, associated with tumor metastasis [76], and stromelysin-1, found in osteoarthritis [77]. Carboxyalkyl peptides containing a biphenylylethyl group inhibit adult Schistosoma mansoni [78]. Considering these reports, it is suggested that compounds 45375808 and 9806452 could be anthelminthic candidates for C. sinensis.

Conclusion
C. sinensis sodium-bile acid co-transporter (CsSBAT) is considered indispensable for survival in the bile duct. Inhibition of the bile transporters may perturb bile acid transport and prove detrimental to C. sinensis. Using a structure-based drug discovery approach, we identified inhibitory compounds targeting the bile acid-binding pockets of CsSBAT, given the physiological importance of this transporter. First, PubChem compounds 49734421 and 124948115 were selected by applying Lipinski's rule of five. Furthermore, we devised a feasible rational screening strategy to search for inhibitor candidates with molecular weight greater than 500 Da. The large inhibitory compounds 45375808 and 9806452 were selected and are expected to bind tightly to CsSBAT. These four inhibitor candidates showed the least toxicity, which may enhance their druggability. Collectively, four compounds are proposed as putative inhibitors of CsSBAT, deserving further in vitro and in vivo evaluation toward anthelminthic development.

Figure 1 Strategy for structure-based virtual screening

ASBTs presenting identical conformation are superposed. The tunnel leading to the TCH-binding site is visualized as a yellow ellipse. Residues forming the TCH-binding cavity are depicted as green spheres in CsSBAT. Other residues participating in the TCH-binding site are provided at each side. Na+ ions are shown as purple circles (Na1 and Na2). Taurocholate (TCH) is depicted as a maroon pentagon

Na+- and taurocholate-binding sites of CsSBAT. All ligands and interactions were predicted using COACH [45] except for the Na3 site (A). Na+-binding sites (B, C), Na1 site (B), and Na2 site (C). Putative Na3 site (D) as suggested by Alhadeff et al. [64]. The consensus residues forming the binding site are presented in stick mode and labeled. Side-chain oxygen and hydrogen atoms are indicated in red and white, respectively. The Na+ ion is depicted as a purple ball and the taurocholate substrate in stick mode

A Venn diagram presents compounds interacting with two SBATs and two ASBTs. Each ellipse represents compounds screened against one of the four ASBTs. The gray area indicates compounds that interact with either OF-CsSBAT or IF-CsSBAT. The black area depicts compounds that interact with both CsSBATs, but not with other ASBTs