Enrichments of Ensemble Docking Strategy Based on the Bayesian Model

doi:10.21203/rs.3.rs-138297/v1

Research article

This work is licensed under a CC BY 4.0 License

Version 1

Motivation

Challenges remain in structure-based drug discovery, including protein flexibility in the binding site. To account for this flexibility, docking into an ensemble of rigid receptor conformations (ensemble docking) has been proposed, with the expectation that it provides higher enrichment than docking into a single rigid receptor. Here we developed an ensemble docking strategy based on the Bayesian Model and validated it on three proteins: BTK, JAK and PARP. The Bayesian Model was used to integrate independent docking runs against an ensemble of rigid crystal structures and structures from MD simulations.

Results

Structures from MD simulations outperformed the crystal structures in separating inhibitors from decoys for BTK and PARP. Furthermore, the ensemble docking strategy performed better than docking into a single rigid conformation.

Chemical Engineering

Keywords: ensemble docking, Bayesian Model, correlation analysis, ROC curve

1. Introduction

Structure-based drug design (SBDD) was introduced into drug discovery as a computational methodology in the 1970s[1]. This predictive technique helps to shorten project timelines and reduce the cost of drug discovery. Advances in crystallography have made a large number of drug-target protein structures available, and SBDD has been applied in many cases to discover potential active candidates, including antivirals for HIV and influenza[2, 3]. However, challenges remain, such as protein flexibility in the binding site, solvent treatment, and ligand electronic effects. Techniques that account for some degree of protein flexibility are expected to increase docking accuracy in both binding mode and affinity prediction[4].

Protein flexibility can be accounted for by simulating the receptor-ligand complex (induced fit) or by using an ensemble of rigid receptor conformations[5, 6]. Ensemble docking of small molecules was first used as a strategy to accommodate receptor flexibility by exploring conformational subsets of the complex across multiple protein structures[7]. These conformations can be gathered in several ways, including experimental data (X-ray crystallography and NMR spectroscopy)[8–10] and computational sampling (molecular dynamics simulations and homology models)[11, 12].

Ensemble docking involves docking compounds into multiple conformations of a target rather than a single protein structure[13]. Some ensemble docking methods employ multiple receptor conformations in a single docking run during the pose-generation phase[14]. Others perform a series of independent docking runs against all members of the ensemble, followed by cluster analysis to select representative conformations[15, 16]. Ensemble docking can yield higher enrichment of known inhibitors than a single receptor conformation[17]. However, using too many conformations does not increase accuracy: an ensemble of 3–5 structures has been found to be sufficient to yield satisfactory results in most cases[9].

The key question is how to integrate the predictions of a series of independent docking runs against multiple conformations of a given target. One plausible way is to merge the independent predictions into a single list and then re-rank or re-score it. However, as with consensus scoring, simply re-ranking or re-scoring with different docking algorithms is quite sensitive to each independent docking study[18]. An alternative is to use machine learning approaches that have been applied in virtual screening[19] and can integrate the predictions of different scoring functions for multiple conformations[20]. The Bayesian Model, a naive Bayesian classifier, has been a useful part of computer-aided drug discovery for many years and was popularized in Pipeline Pilot[21, 22]. In 2004, the enrichment factor of consensus scoring was shown to improve when a Bayesian Model was used to integrate the predictions from five scoring functions[23]. In addition, multiple receptor structures of ROCK1 have been combined into a single integrated model by using the Bayesian Model, and the integrated model achieved the best prediction power[20, 24].
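The simple merge-and-re-rank approach mentioned above can be sketched as follows. This is a minimal illustration only: the conformation names, ligand names and scores are invented, and it assumes that a higher docking score is better (as for GOLD fitness scores).

```python
def merge_and_rerank(scores_by_receptor):
    """Keep each ligand's best (highest) score across receptors, then rank."""
    best = {}
    for receptor, ligand_scores in scores_by_receptor.items():
        for ligand, score in ligand_scores.items():
            if ligand not in best or score > best[ligand][0]:
                best[ligand] = (score, receptor)
    # Sort ligands from the best to the worst merged score.
    return sorted(best.items(), key=lambda item: item[1][0], reverse=True)


# Hypothetical docking scores for three ligands against two conformations.
scores = {
    "conf_A": {"lig1": 62.0, "lig2": 48.5, "lig3": 55.1},
    "conf_B": {"lig1": 58.3, "lig2": 66.2, "lig3": 50.0},
}
ranking = merge_and_rerank(scores)
for ligand, (score, receptor) in ranking:
    print(ligand, score, receptor)
```

Because each ligand keeps only its single best score, one outlying docking run can change the final ranking, which is the sensitivity noted in the text.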

In this study, we employed the Bayesian Model to integrate the molecular docking predictions of each ensemble, and we evaluated and validated the performance of each single protein structure and each ensemble of protein structures. The procedure for this ensemble docking via multiple receptor conformations is summarized in Fig. 1.

Considering their pharmaceutical importance in drug design and discovery, we selected Bruton’s tyrosine kinase (BTK), Janus protein tyrosine kinases (JAK) and poly (ADP-ribose) polymerases (PARP) to develop the ensemble docking strategy. They are closely related to chronic immune diseases (rheumatoid arthritis and lupus)[25, 26], the intracellular cytokine signaling pathway[27], and cancer[28–30], respectively. We employed the GOLD docking program to perform molecular docking against each crystal structure and each molecular dynamics (MD) snapshot. Finally, an ensemble docking strategy was built to integrate the independent docking runs of each ensemble based on the Bayesian Model.

2. Methods

2.1. Ligand Dataset

The active ligands with quantitative biological activities (Ki, Kd, and IC50) were extracted from BindingDB[31]. The number of inhibitors is 125, 152, and 134 for BTK, JAK, and PARP, respectively. The validation dataset was prepared by mixing the known inhibitors with twenty times as many diverse decoys, randomly selected from the ZINC database[32]. Thus, the validation dataset contains 2625, 3192, and 2814 molecules for BTK, JAK and PARP, respectively.
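The dataset sizes follow directly from the 1:20 ratio of actives to decoys. The short check below reproduces the totals quoted above; the function and variable names are illustrative.

```python
DECOYS_PER_ACTIVE = 20  # ratio stated in the text

actives = {"BTK": 125, "JAK": 152, "PARP": 134}


def dataset_size(n_actives, ratio=DECOYS_PER_ACTIVE):
    """Return (number of decoys, total molecules) for a given number of actives."""
    n_decoys = n_actives * ratio
    return n_decoys, n_actives + n_decoys


sizes = {target: dataset_size(n) for target, n in actives.items()}
for target, (n_decoys, n_total) in sizes.items():
    print(target, n_decoys, n_total)
```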

2.2. Protein Structures Preparation

The crystal structures of the ligand-bound target proteins were retrieved from the RCSB Protein Data Bank (PDB, https://www.rcsb.org/). There were 40, 22, and 38 crystal-structure complexes for BTK, JAK, and PARP, respectively. Water molecules and co-crystallized ligands were removed from the crystal structures. The open-source software PyMOL version 1.3 was used to align the structures of BTK, JAK, and PARP to 4Z3V, 4EHZ and 6BHV, respectively.

2.3. Representative MD Simulations Generation

For each protein, the crystal structure with the best discrimination power was chosen as the initial structure for the MD simulation. Representative MD structures were determined from the trajectory using the gromos algorithm[33] in GROMACS. The center of a cluster, defined as the structure with the smallest average RMSD (root mean square deviation) from all other structures of the cluster, was taken as the representative MD structure. In detail, 2000 snapshots were extracted from the stable phase of the trajectory (20–40 ns), and an RMSD matrix between every pair of protein conformations was built. The 2000 snapshots were clustered into four categories with an RMSD cutoff of 0.2 nm. For each category, the center of the cluster was selected as the representative structure.
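The clustering step can be illustrated with a small pure-Python sketch of the gromos-style procedure: repeatedly take the structure with the most neighbours within the cutoff, remove it together with its neighbours, and continue with the rest. The 5 x 5 RMSD matrix below is invented for illustration, and the code is not the GROMACS implementation; the center is chosen as defined in the text (smallest average RMSD to the other cluster members).

```python
def gromos_clusters(rmsd, cutoff):
    """Gromos-style clustering on a symmetric RMSD matrix (list of lists)."""
    remaining = set(range(len(rmsd)))
    clusters = []
    while remaining:
        # Neighbours of each remaining structure within the cutoff (self included).
        neighbours = {
            i: [j for j in remaining if rmsd[i][j] <= cutoff] for i in remaining
        }
        # The structure with the most neighbours seeds the next cluster
        # (ties are broken by the lowest index).
        seed = max(sorted(remaining), key=lambda i: len(neighbours[i]))
        members = sorted(neighbours[seed])
        clusters.append(members)
        remaining -= set(members)
    return clusters


def cluster_center(rmsd, members):
    """Member with the smallest average RMSD to the other cluster members."""
    if len(members) == 1:
        return members[0]
    return min(
        members,
        key=lambda i: sum(rmsd[i][j] for j in members if j != i) / (len(members) - 1),
    )


# Toy RMSD matrix in nm (hypothetical values).
rmsd = [
    [0.00, 0.10, 0.15, 0.30, 0.35],
    [0.10, 0.00, 0.12, 0.28, 0.33],
    [0.15, 0.12, 0.00, 0.25, 0.31],
    [0.30, 0.28, 0.25, 0.00, 0.08],
    [0.35, 0.33, 0.31, 0.08, 0.00],
]
clusters = gromos_clusters(rmsd, cutoff=0.2)
centers = [cluster_center(rmsd, members) for members in clusters]
print(clusters, centers)
```

With a fixed cutoff the number of clusters is an outcome of the data, so obtaining four categories at 0.2 nm is a property of the trajectories studied here rather than a parameter of the algorithm.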

2.4. Generation and Validation of Bayesian Models

Molecular docking calculations were performed against each representative crystal structure and each representative structure from the MD simulations. The small-molecule docking scores for each single conformation and each ensemble were used to generate the score-based Bayesian Models. The Bayesian Model of each ensemble was built from the docking scores with the Create Bayesian Model module in Discovery Studio (DS) 4.0 (details are reported in Additional file 2).
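To make the idea of a score-based Bayesian model concrete, the sketch below fits a Gaussian naive Bayes classifier to docking scores from several receptor conformations and returns a log-odds score per ligand. It is a stand-in under stated assumptions, not the Discovery Studio module: the Gaussian likelihood, the function names and the toy scores are all illustrative choices.

```python
import math


def fit_gaussian_nb(rows, labels):
    """rows: one docking score per receptor for each ligand; labels: 1 active, 0 decoy."""
    model = {}
    for cls in (0, 1):
        cls_rows = [r for r, y in zip(rows, labels) if y == cls]
        n = len(cls_rows)
        columns = list(zip(*cls_rows))
        means = [sum(col) / n for col in columns]
        # Population variance with a small floor to avoid division by zero.
        variances = [
            max(sum((x - m) ** 2 for x in col) / n, 1e-6)
            for col, m in zip(columns, means)
        ]
        model[cls] = {"prior": n / len(rows), "means": means, "vars": variances}
    return model


def log_gauss(x, mean, var):
    """Log density of a normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def bayes_score(model, row):
    """Log-odds that a ligand is active given its docking scores."""
    score = math.log(model[1]["prior"] / model[0]["prior"])
    for k, x in enumerate(row):
        score += log_gauss(x, model[1]["means"][k], model[1]["vars"][k])
        score -= log_gauss(x, model[0]["means"][k], model[0]["vars"][k])
    return score


# Hypothetical scores against a two-member ensemble (higher = better).
train_rows = [[70, 68], [72, 65], [66, 71], [50, 52], [48, 55], [53, 47], [51, 50]]
train_labels = [1, 1, 1, 0, 0, 0, 0]
model = fit_gaussian_nb(train_rows, train_labels)
print(bayes_score(model, [69, 67]), bayes_score(model, [50, 51]))
```

Each receptor contributes one additive term to the score, which is how a single model can combine the independent docking runs of an ensemble.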

To evaluate the prediction power of each Bayesian Model, 70% of the inhibitors and non-inhibitors in the validation dataset were randomly selected as the training set, and the remaining 30% of the compounds formed the test set (see Additional file 1).
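A 70/30 split applied separately to inhibitors and non-inhibitors might look like the sketch below, shown for the BTK set sizes. The seed, the compound identifiers and the rounding rule (integer floor) are assumptions; the actual split used is given in Additional file 1.

```python
import random


def split_70_30(items, rng):
    """Shuffle a copy of items and split it 70% / 30% (floor on the training size)."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = len(shuffled) * 7 // 10
    return shuffled[:n_train], shuffled[n_train:]


rng = random.Random(0)  # fixed seed so the split is reproducible
actives = [f"active_{i}" for i in range(125)]  # placeholder identifiers
decoys = [f"decoy_{i}" for i in range(2500)]

# Splitting each class separately keeps the 1:20 ratio in both subsets.
train_act, test_act = split_70_30(actives, rng)
train_dec, test_dec = split_70_30(decoys, rng)
print(len(train_act), len(test_act), len(train_dec), len(test_dec))
```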

3. Results and Discussion

3.1. Structural Clustering

A total of 40, 22, and 38 crystal structures were extracted for BTK, JAK, and PARP, respectively. For each target, the crystal structures were clustered into four categories (Fig. 2), and the structure with the highest resolution in each category was taken as its representative. The representative crystal structures were 6O8I (1.42 Å), 4RX5 (1.36 Å), 4RFZ (1.17 Å) and 5P9K (1.28 Å) for BTK (Additional file 2: Table S1); 6BBU (2.08 Å), 6N7A (1.33 Å), 6AAH (1.83 Å) and 4EHZ (2.17 Å) for JAK (Additional file 2: Table S2); and 4ZZZ (1.9 Å), 5WS1 (1.9 Å), 6I8T (2.1 Å) and 6NRH (1.5 Å) for PARP (Additional file 2: Table S3). Scatterplots of per-residue RMSD values for the crystal structures were also generated (Additional file 2: Figure S1).

3.2. Performance in Docking Calculations

We previously studied which scoring functions are suitable by considering both the reproducibility of the ligand conformation and binding affinity prediction[34, 35]. The CHEMPLP scoring function was shown to be appropriate in both docking power and scoring power and was therefore used here (Additional file 2: Table S4).

The co-crystallized ligands and the prepared validation dataset were used to assess docking power and discrimination power, expressed as RMSD and P-value, respectively. The numbers of active compounds and decoys are listed in Table 1. Docking calculations were conducted by screening the validation dataset to discriminate inhibitors from non-inhibitors. The Mann-Whitney U test (P-value) was used to evaluate the significance of the difference between the docking-score distributions of the inhibitors and non-inhibitors.
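For readers unfamiliar with the test, the following is a minimal sketch of the Mann-Whitney U statistic with a two-tailed P-value from the normal approximation. It is a simplified illustration: it omits the tie correction and the continuity correction that statistical packages may apply, and the example scores are invented.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for sample x, two-tailed asymptotic P-value)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal probability
    return u1, p


# Hypothetical docking scores for decoys and inhibitors.
decoy_scores = [41.0, 45.5, 47.2, 50.1, 52.3, 44.8]
inhibitor_scores = [61.4, 58.9, 66.0, 55.7, 63.2]
u, p = mann_whitney_u(decoy_scores, inhibitor_scores)
print(u, p)
```

With thousands of compounds per target, even a moderate shift between the two score distributions produces extremely small P-values, so the values are more useful for comparing structures than as absolute measures.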

An RMSD below 2 Å is considered a successful prediction of the binding pose. As shown in Table 2 and Fig. 3, all four representative crystal structures of BTK and of JAK reproduced the ligand conformation on re-docking, whereas for PARP only 6I8T succeeded in pose prediction. For every representative crystal structure, the P-value indicated a significant difference in docking scores between inhibitors and decoys, which means favorable discrimination power (Table 2, Fig. 4). The best P-values were 8.586E-74, 2.438E-67 and 6.260E-57 for BTK, JAK, and PARP, respectively, so the crystal structures of BTK performed best in distinguishing inhibitors from non-inhibitors. Docking performance differed between targets, and even different structures of the same target showed remarkable differences. On the basis of the P-values, the representative complexes 5P9K, 6N7A and 6I8T were chosen as the initial structures for the MD simulations of BTK, JAK and PARP, respectively.

Table 1

Targets, actives and decoys used in this study
Target	BTK	JAK	PARP
actives	125	152	134
decoys	2500	3040	2680
EF^max	20	20	20

Table 2

The GOLD docking performance of representative crystal structures against BTK, JAK and PARP, respectively
Target	PDB ID	RMSD(Å)	P-Value
BTK	6O8I	1.6808	3.557E-71
	4RX5	0.8877	9.094E-70
	4RFZ	0.8824	2.954E-70
	5P9K	1.6393	8.586E-74
JAK	6BBU	0.7786	1.846E-52
	6N7A	0.3008	2.438E-67
	6AAH	0.5392	1.093E-56
	4EHZ	0.5708	3.234E-62
PARP	4ZZZ	2.0444	1.085E-32
	5WS1	2.3612	1.079E-32
	6I8T	0.9136	6.260E-57
	6NRH	2.3873	2.046E-48

The P-value is the two-tailed asymptotic significance of the Mann-Whitney U test. The PDB IDs selected as the initial structures for the molecular dynamics simulations are 5P9K (BTK), 6N7A (JAK) and 6I8T (PARP).
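The two selection criteria can be applied to the values of Table 2 in a few lines: an RMSD below 2 Å counts as a successful re-docking, and the structure with the smallest P-value per target is taken forward to MD. The data are copied from Table 2; the variable names are illustrative.

```python
POSE_CUTOFF = 2.0  # Angstrom, success criterion stated in the text

# (RMSD in Angstrom, P-value) per PDB ID, from Table 2.
table2 = {
    "BTK": {"6O8I": (1.6808, 3.557e-71), "4RX5": (0.8877, 9.094e-70),
            "4RFZ": (0.8824, 2.954e-70), "5P9K": (1.6393, 8.586e-74)},
    "JAK": {"6BBU": (0.7786, 1.846e-52), "6N7A": (0.3008, 2.438e-67),
            "6AAH": (0.5392, 1.093e-56), "4EHZ": (0.5708, 3.234e-62)},
    "PARP": {"4ZZZ": (2.0444, 1.085e-32), "5WS1": (2.3612, 1.079e-32),
             "6I8T": (0.9136, 6.260e-57), "6NRH": (2.3873, 2.046e-48)},
}

# Structures whose re-docked pose is within the cutoff.
successes = {
    target: [pdb for pdb, (rmsd, _) in rows.items() if rmsd < POSE_CUTOFF]
    for target, rows in table2.items()
}
# Structure with the smallest P-value (best discrimination) per target.
best = {
    target: min(rows, key=lambda pdb: rows[pdb][1])
    for target, rows in table2.items()
}
print(successes)
print(best)
```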

3.3. Molecular Dynamics Simulations

The crystal structures 5P9K, 6N7A and 6I8T were chosen as the initial structures for the MD simulations. For each complex, a 40 ns MD simulation was sufficient for the trajectory to reach equilibrium. The structural deviation between the structures sampled during the simulation and the initial structure was calculated over the backbone. As shown in Fig. 5, the backbone RMSD of BTK, JAK and PARP stays below 0.3 nm, which indicates that the systems equilibrated during the simulations. To capture the structural differences within the stable phase of each protein, the 20–40 ns segment of the trajectory was collected. The RMSD matrix between every pair of conformations was constructed and used for the subsequent cluster analysis. The 20–40 ns trajectory was then clustered, and the central conformation of each cluster was taken as a representative structure (Fig. 6). For BTK, the structures 5P9K_3768, 5P9K_2526, 5P9K_2992 and 5P9K_2629 were obtained as representative MD snapshots. Likewise, 6N7A_3398, 6N7A_2014, 6N7A_3726 and 6N7A_2010 were obtained for JAK, and 6I8T_3041, 6I8T_3906, 6I8T_2072 and 6I8T_3274 for PARP.

3.4. The Correlation Analysis of Docking Performance Based on Crystal Structures and Structures from MD Simulations

Docking performance was evaluated with the docking scores obtained from the molecular docking simulations. For each target, the docking scores of the known inhibitors and non-inhibitors against each representative crystal structure and each structure from the MD simulations were used for correlation analysis, including Pearson and Spearman rank correlation. Two variables are considered highly correlated when the absolute value of the correlation coefficient (r or ρ) is greater than or equal to 0.8, and weakly correlated (essentially unrelated) when it is less than 0.3. Redundant structures, whose correlation coefficient with another structure is high (>0.9) and whose docking results are therefore highly similar, were discarded. The Pearson correlation coefficients (r) and the Spearman rank correlation coefficients (ρ) of the docking scores were then calculated for every pair of protein structures (Fig. 7). The detailed results are given in Additional file 2: Table S5 and Table S6.
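Both coefficients are simple to compute from two vectors of docking scores for the same ligands. The sketch below is a plain-Python illustration with invented scores; Spearman's ρ is obtained as the Pearson correlation of the (tie-averaged) ranks, and the 0.9 redundancy threshold is the one quoted above.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient r of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values receive their average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def spearman(x, y):
    """Spearman rank correlation rho as the Pearson correlation of ranks."""
    return pearson(ranks(x), ranks(y))


REDUNDANCY_THRESHOLD = 0.9  # structures correlated above this are redundant

# Hypothetical docking scores of the same five ligands against two structures.
scores_a = [55.2, 61.0, 47.3, 70.4, 52.8]
scores_b = [53.9, 63.1, 45.0, 68.7, 50.2]
r = pearson(scores_a, scores_b)
rho = spearman(scores_a, scores_b)
print(round(r, 3), round(rho, 3), r > REDUNDANCY_THRESHOLD)
```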

3.5. Performance of Bayesian Models Based on Each Single Structure and Ensemble of Protein Structures

To explore how the number and identity of ensemble members influence virtual screening performance, we built ensembles of different sizes and compositions. Based on the correlation analysis and the 5-fold cross-validation results of the Bayesian Models, the structures 6O8I and 5P9K_2629 for BTK, 6N7A_2681 for JAK and 6I8T_3906 for PARP were discarded. The ensembles were constructed according to the following principle: structures with better validation results are preferred, and each smaller ensemble is the basis of the next larger one. Because crystal structures and MD structures differed in their validation results, ensembles consisting only of crystal structures or only of MD structures were also validated. On this basis, nine panels were defined: Crystal Structure, MD Simulation, Two-size Ensemble, Three-size Ensemble, Four-size Ensemble, Five-size Ensemble, Six-size Ensemble, Seven-size Ensemble and Eight-size Ensemble. For all targets, the members of all ensembles are listed in Additional file 2: Table S7. Bayesian Models were created from the docking scores of each protein structure and ensemble and validated by leave-one-out cross-validation. The detailed 5-fold cross-validation results are summarized in Additional file 2: Table S8, and an external test set was used to test the performance of the models (Additional file 2: Table S9). The Bayesian score was extracted from the validation results and is summarized in Table 3. For each single receptor, the virtual screening performance of the Bayesian Model was quantified by the area under the curve (AUC) of its receiver operating characteristic (ROC) plot. ROC curves were also plotted for the two Four-size ensembles that combine all crystal structures and all MD structures, and for the ensemble with the highest model score.

Table 3 shows that, with the Bayesian score as the evaluation criterion, the protein structures of BTK performed well on both the training set and the external test set, with all ROC scores above 0.9. In contrast, the protein structures of JAK and PARP performed worse, with some ROC scores below 0.9 and a few below 0.8. For JAK, the crystal structures outperformed the MD structures, whereas the reverse was observed for BTK and PARP. The same trend was found for the ensembles combining all crystal structures (Ensemble 7) and all MD structures (Ensemble 8) in JAK and PARP. Compared with the single rigid structures, the ensembles generally gave better Bayesian scores regardless of size, which means that the Bayesian Models based on ensembles give more satisfactory predictions in separating inhibitors from non-inhibitors. The best-performing Bayesian Models are Ensemble 10 (Five-size Ensemble), Ensemble 10 (Five-size Ensemble) and Ensemble 9 (Four-size Ensemble) for BTK, JAK and PARP, respectively. Furthermore, model performance does not always increase with the number of members: the best-performing model is usually a Four-size or Five-size Ensemble.

ROC curves were used to illustrate the virtual screening performance of the Bayesian Models (Fig. 8), and the corresponding AUC values are given in Table 4. Consistent with the Bayesian scores, the MD structures outperformed the crystal structures for BTK and PARP. However, the ensemble of all crystal structures (Ensemble 7) performed better than the ensemble of all MD structures (Ensemble 8). The lower correlation of docking scores among crystal structures than among MD structures may account for this. When different structures are used for the docking calculations, the top-ranking compounds are quite different. This highlights the importance of choosing appropriate protein structures for docking calculations and, in turn, for ensemble construction.
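The AUC of a ROC curve equals the probability that a randomly chosen inhibitor is scored higher than a randomly chosen non-inhibitor, which gives a compact way to compute it without drawing the curve. The sketch below uses this pairwise definition on invented model scores; it is quadratic in the number of compounds and meant only as an illustration.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (active, decoy) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical model scores: label 1 = inhibitor, 0 = non-inhibitor.
scores = [0.9, 0.2, 0.8, 0.1]
labels = [1, 1, 0, 0]
auc = roc_auc(scores, labels)
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of inhibitors from non-inhibitors.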

Table 3

Bayesian Score of the Bayesian Models Based on the Docking Scores of Each Single Representative Complex and Ensemble for BTK, JAK and PARP
Panel	BTK			JAK			PARP
	Structure or Ensemble	ROC Score (Training Set)	ROC Score (Test Set)	Structure or Ensemble	ROC Score (Training Set)	ROC Score (Test Set)	Structure or Ensemble	ROC Score (Training Set)	ROC Score (Test Set)
Crystal Structure	4RFZ	0.957	0.95	6BBU	0.862	0.856	4ZZZ	0.779	0.791
	6O8I	0.96	0.948	6N7A	0.906	0.883	5WS1	0.775	0.828
	4RX5	0.958	0.942	6AAH	0.87	0.854	6I8T	0.878	0.908
	5P9K	0.961	0.97	4EHZ	0.89	0.87	6NRH	0.843	0.865
MD Simulation	5P9K_3768	0.951	0.948	6N7A_3398	0.828	0.783	6I8T_3041	0.879	0.906
	5P9K_2526	0.967	0.952	6N7A_2014	0.864	0.796	6I8T_3906	0.853	0.92
	5P9K_2992	0.968	0.983	6N7A_2681	0.796	0.768	6I8T_2072	0.841	0.87
	5P9K_2629	0.963	0.941	6N7A_2010	0.855	0.834	6I8T_3274	0.867	0.934
Two-size Ensemble	Ensemble 1	0.975	0.973	Ensemble 1	0.913	0.894	Ensemble 1	0.905	0.925
	Ensemble 2	0.974	0.986	Ensemble 2	0.884	0.836	Ensemble 2	0.892	0.934
	Ensemble 3	0.978	0.992	Ensemble 3	0.912	0.887	Ensemble 3	0.912	0.938
Three-size Ensemble	Ensemble 4	0.979	0.982	Ensemble 4	0.916	0.896	Ensemble 4	0.897	0.923
	Ensemble 5	0.976	0.985	Ensemble 5	0.879	0.835	Ensemble 5	0.895	0.938
	Ensemble 6	0.981	0.991	Ensemble 6	0.917	0.885	Ensemble 6	0.918	0.939
Four-size Ensemble	Ensemble 7	0.981	0.98	Ensemble 7	0.926	0.911	Ensemble 7	0.893	0.92
	Ensemble 8	0.977	0.982	Ensemble 8	0.874	0.829	Ensemble 8	0.895	0.933
	Ensemble 9	0.981	0.981	Ensemble 9	0.919	0.899	Ensemble 9	0.919	0.948
Five-size Ensemble	Ensemble 10	0.983	0.988	Ensemble 10	0.926	0.912	Ensemble 10	0.916	0.949
Five-size Ensemble	Ensemble 11	0.982	0.989	Ensemble 11	0.924	0.9	Ensemble 11	0.914	0.942
Six-size Ensemble	Ensemble 12	0.983	0.987	Ensemble 12	0.924	0.901	Ensemble 12	0.914	0.946
Seven-size Ensemble	Ensemble 13	0.983	0.986	Ensemble 13	0.918	0.894	Ensemble 13	0.91	0.942
Eight-size Ensemble	Ensemble 14	0.982	0.985	Ensemble 14	0.914	0.888	Ensemble 14	0.906	0.939
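The best ensemble per target can be read off Table 3 programmatically. The sketch below copies the (training, test) ROC scores of Ensembles 1 to 14 from Table 3 and ranks them by training score, using the test score to break ties; this tie-break is an assumption made for the illustration, not a rule stated in the text.

```python
# (training ROC score, test ROC score) for Ensembles 1..14, from Table 3.
table3 = {
    "BTK": [(0.975, 0.973), (0.974, 0.986), (0.978, 0.992), (0.979, 0.982),
            (0.976, 0.985), (0.981, 0.991), (0.981, 0.980), (0.977, 0.982),
            (0.981, 0.981), (0.983, 0.988), (0.982, 0.989), (0.983, 0.987),
            (0.983, 0.986), (0.982, 0.985)],
    "JAK": [(0.913, 0.894), (0.884, 0.836), (0.912, 0.887), (0.916, 0.896),
            (0.879, 0.835), (0.917, 0.885), (0.926, 0.911), (0.874, 0.829),
            (0.919, 0.899), (0.926, 0.912), (0.924, 0.900), (0.924, 0.901),
            (0.918, 0.894), (0.914, 0.888)],
    "PARP": [(0.905, 0.925), (0.892, 0.934), (0.912, 0.938), (0.897, 0.923),
             (0.895, 0.938), (0.918, 0.939), (0.893, 0.920), (0.895, 0.933),
             (0.919, 0.948), (0.916, 0.949), (0.914, 0.942), (0.914, 0.946),
             (0.910, 0.942), (0.906, 0.939)],
}

# Number of members of each ensemble, following the panels of Table 3.
ENSEMBLE_SIZE = {1: 2, 2: 2, 3: 2, 4: 3, 5: 3, 6: 3, 7: 4, 8: 4, 9: 4,
                 10: 5, 11: 5, 12: 6, 13: 7, 14: 8}

# Tuples compare element by element: training score first, then test score.
best = {
    target: max(range(1, 15), key=lambda k: rows[k - 1])
    for target, rows in table3.items()
}
for target, ensemble in best.items():
    print(target, f"Ensemble {ensemble}", f"{ENSEMBLE_SIZE[ensemble]} members")
```

Under this ranking the result matches the ensembles named in the text: Ensemble 10 for BTK and JAK, and Ensemble 9 for PARP.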

Table 4

Virtual Screening Performance of Bayesian Models for BTK, JAK and PARP
Validation Result
BTK			JAK			PARP
Ensemble	AUC	Quality	Ensemble	AUC	Quality	Ensemble	AUC	Quality
4RFZ	0.916	Excellent	6BBU	0.827	Good	4ZZZ	0.751	Fair
6O8I	0.916	Excellent	6N7A	0.844	Good	5WS1	0.759	Fair
4RX5	0.91	Excellent	6AAH	0.836	Good	6I8T	0.859	Good
5P9K	0.93	Excellent	4EHZ	0.865	Good	6NRH	0.826	Good
5P9K_3768	0.923	Excellent	6N7A_3398	0.794	Fair	6I8T_3041	0.86	Good
5P9K_2526	0.924	Excellent	6N7A_2014	0.805	Good	6I8T_3906	0.83	Good
5P9K_2992	0.931	Excellent	6N7A_2681	0.749	Fair	6I8T_2072	0.816	Good
5P9K_2629	0.92	Excellent	6N7A_2010	0.817	Good	6I8T_3274	0.854	Good
Ensemble 7	0.974	Excellent	Ensemble 7	0.919	Excellent	Ensemble 7	0.905	Excellent
Ensemble 8	0.972	Excellent	Ensemble 8	0.858	Good	Ensemble 8	0.901	Excellent
Ensemble 10	0.979	Excellent	Ensemble 10	0.919	Excellent	Ensemble 9	0.922	Excellent

4. Conclusions

In this study, ensemble docking was applied to BTK, JAK and PARP to retrieve known inhibitors from the established ligand training and test sets, and its performance was compared with that of a standard single rigid receptor. The independent molecular docking predictions of each ensemble were integrated with the Bayesian Model, and the integrated predictions were represented by ROC curves to evaluate the enrichment of each single rigid receptor and each ensemble. In total, 14 ensembles for each of BTK, JAK and PARP were screened against the ligand dataset. Rigid crystal structures and MD snapshots were treated in the same way so that they could be compared. Most of the ensembles performed better than a single rigid receptor in enrichment. Although the ensembles of different sizes show no absolute trend in the enrichment of known inhibitors across the three proteins, in general the enrichment increases with ensemble size up to 4–5 members, which is consistent with earlier reports[9]. Larger ensembles, with six, seven or eight members, did not show a clear advantage in enrichment. These results suggest that the ensemble docking strategy built in this work can efficiently increase the enrichment of known inhibitors for BTK, JAK and PARP. We expect that the strategy can serve as a useful tool in virtual screening to find more promising and diverse inhibitors when applied to more targets.

Abbreviations

SBDD: Structure-based drug design

MD: Molecular dynamics

BTK: Bruton’s tyrosine kinase

JAK: Janus protein tyrosine kinases

PARP: Poly (ADP-ribose) polymerases

PDB: Protein Data Bank

RMSD: Root mean square deviation

DS: Discovery Studio

AUC: Area under the curve

ROC: Receiver operating characteristic

Availability of data and materials

The datasets supporting the conclusions of this article are included within the article and its additional file.

Competing interests

The authors declare that they have no competing interests.

Funding

This project was financially supported by the General Program of Applied Basic Research of Yunnan Province (Grant No: 202001AS070026), the State Key Laboratory of Phytochemistry and Plant Resources in West China (Grant No: P2017-KF07, P2018-KF14) and the Introduction Program of Scientific Researcher of Sichuan University of Science & Engineering (2020RC40).

Authors' contributions

Zhili Zuo and Yi Liu designed the study. Yu Lei was responsible for Bayesian Model development, implementation and validation. Sheng Guo performed the statistical analysis. Yu Lei and Sheng Guo prepared the manuscript, which was revised by Zhili Zuo. All authors read and approved the final manuscript.

Acknowledgements

Computational resources were supplied by the Supercomputing Center of Sichuan University of Science & Engineering and the Kunming Institute of Botany, Chinese Academy of Sciences.

References

1. Amaro R.E. et al (2018) Ensemble Docking in Drug Discovery. Biophys J 114(10):2271-2278
2. Kaldor S.W. et al (1997) Viracept (Nelfinavir Mesylate, AG1343): A Potent, Orally Bioavailable Inhibitor of HIV-1 Protease. J Med Chem 40(24):3979-3985
3. Von Itzstein M. et al (1993) Rational design of potent sialidase-based inhibitors of influenza virus replication. Nature 363(6428):418-423
4. Nichols S.E. et al (2011) Predictive Power of Molecular Dynamics Receptor Structures in Virtual Screening. Journal of Chemical Information and Modeling 51(6):1439-1446
5. Kalid O. et al (2012) Consensus Induced Fit Docking (cIFD): methodology, validation, and application to the discovery of novel Crm1 inhibitors. Journal of Computer-Aided Molecular Design 26(11):1217-1228
6. Feixas F. et al (2014) Exploring the role of receptor flexibility in structure-based drug discovery. Biophysical Chemistry 186:31-45
7. Carlson H.A., Masukawa K.M., McCammon J.A. (1999) Method for Including the Dynamic Fluctuations of a Protein in Computer-Aided Drug Design. J Phys Chem A 103(49):10213-10219
8. Craig I.R., Essex J.W., Spiegel K. (2010) Ensemble Docking into Multiple Crystallographically Derived Protein Structures: An Evaluation Based on the Statistical Analysis of Enrichments. J Chem Inf Model 50(4):511-524
9. Rueda M., Bottegoni G., Abagyan R. (2010) Recipes for the Selection of Experimental Protein Conformations for Virtual Screening. Journal of Chemical Information and Modeling 50(1):186-193
10. Damm K.L., Carlson H.A. (2007) Exploring Experimental Sources of Multiple Protein Conformations in Structure-Based Drug Design. Journal of the American Chemical Society 129(26):8225-8235
11. Sherman W. et al (2006) Novel Procedure for Modeling Ligand/Receptor Induced Fit Effects. Journal of Medicinal Chemistry 49(2):534-553
12. Amaro R.E., Baron R., McCammon J.A. (2008) An improved relaxed complex scheme for receptor flexibility in computer-aided drug design. Journal of Computer-Aided Molecular Design 22(9):693-705
13. Leach A.R., Shoichet B.K., Peishoff C.E. (2006) Prediction of Protein-Ligand Interactions. Docking and Scoring: Successes and Gaps. ChemInform 37(50)
14. Totrov M., Abagyan R. (2008) Flexible ligand docking to multiple receptor conformations: a practical alternative. Current Opinion in Structural Biology 18(2):178-184
15. Amaro R.E., Li W.W. (2010) Emerging Methods for Ensemble-Based Virtual Screening. Current Topics in Medicinal Chemistry 10(1):3-13
16. Totrov M., Abagyan R. (2008) Flexible ligand docking to multiple receptor conformations: a practical alternative. Curr Opin Struct Biol 18(2):178-184
17. Osguthorpe D.J., Sherman W., Hagler A.T. (2012) Generation of Receptor Structural Ensembles for Virtual Screening Using Binding Site Shape Analysis and Clustering. Chemical Biology & Drug Design 80(2):182-193
18. Bottegoni G. et al (2011) Systematic Exploitation of Multiple Receptor Conformations for Virtual Ligand Screening. PLOS ONE 6(5):e18845
19. Sato T., Honma T., Yokoyama S. (2010) Combining Machine Learning and Pharmacophore-Based Interaction Fingerprint for in Silico Screening. Journal of Chemical Information and Modeling 50(1):170-185
20. Tian S. et al (2013) Development and Evaluation of an Integrated Virtual Screening Strategy by Combining Molecular Docking and Pharmacophore Searching Based on Multiple Protein Structures. Journal of Chemical Information and Modeling 53(10):2743-2756
21. Chen B. et al (2012) Comparison of Random Forest and Pipeline Pilot Naïve Bayes in Prospective QSAR Predictions. Journal of Chemical Information and Modeling 52(3):792-803
22. Klon A.E., Lowrie J.F., Diller D.J. (2006) Improved Naive Bayesian Modeling of Numerical Data for Absorption, Distribution, Metabolism and Excretion (ADME) Property Prediction. ChemInform 37(49)
23. Klon A.E. et al (2004) Combination of a naive Bayes classifier with consensus scoring improves enrichment of high-throughput docking results. Journal of Medicinal Chemistry 47(18)
24. Tian S. et al (2013) Modeling compound-target interaction network of traditional Chinese medicines for type II diabetes mellitus: insight for polypharmacology and drug design. Journal of Chemical Information and Modeling 53(7)
25. Puri K.D., Di Paolo J.A., Gold M.R. (2013) B-Cell Receptor Signaling Inhibitors for Treatment of Autoimmune Inflammatory Diseases and B-Cell Malignancies. International Reviews of Immunology 32(4):397-427
26. Di Paolo J.A. et al (2011) Specific Btk inhibition suppresses B cell– and myeloid cell–mediated arthritis. Nature Chemical Biology 7(1):41-50
27. O'Shea J.J. et al (2004) A new modality for immunosuppression: targeting the JAK/STAT pathway. Nature Reviews Drug Discovery 3(7):555-564
28. O'Shaughnessy J. et al (2011) Iniparib plus chemotherapy in metastatic triple-negative breast cancer. N Engl J Med 364:205-214
29. Ledermann J. et al (2012) Olaparib Maintenance Therapy in Platinum-Sensitive Relapsed Ovarian Cancer. N Engl J Med 366(15):1382-1392
30. Novello S. et al (2014) A phase II randomized study evaluating the addition of iniparib to gemcitabine plus cisplatin as first-line therapy for metastatic non-small-cell lung cancer. Ann Oncol 25:2156-2162
31. Gilson M.K. et al (2016) BindingDB in 2015: A public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Research 44(D1):D1045-D1053
32. Irwin J.J. et al (2012) ZINC: a free tool to discover chemistry for biology. Journal of Chemical Information and Modeling 52(7)
33. Daura X. et al (1999) Peptide Folding: When Simulation Meets Experiment. Angewandte Chemie International Edition 38(1-2):236-240
34. Li Y. et al (2014) Comparative assessment of scoring functions on an updated benchmark: 2. Evaluation methods and general results. J Chem Inf Model 54(6):1717-1736
35. Su M. et al (2019) Comparative Assessment of Scoring Functions: The CASF-2016 Update. J Chem Inf Model 59(2):895-913
