All in silico (computational) studies were performed on a KBS system with an Intel Pentium Core i5 processor (7th generation), 12 GB RAM and a 1 TB hard disk. All ligand structures were drawn in Maestro 12.0, a predictive modelling software from Schrödinger [21]. The PHASE 3.0 module (Schrödinger Release 2021-4 n.d.) was used to perform pharmacophore and 3D-QSAR modelling, and the GLIDE module [23] was used for the docking study. GROMACS 18.4, an open-source molecular dynamics (MD) software package [24], was used for the MD simulation.
Datasets of 72 and 75 compounds were selected from reported inhibitors of the human breast (MCF-7) [25–29] and human lung (A-549) [30–35] cancer cell lines, respectively. All compounds share the same coumarin nucleus and have biological activity reported as IC50 (µM). The structures were prepared as 3D conformations, and the ligands were then energy-minimized using the OPLS_2005 force field. The biological activity of each ligand was converted from its IC50 value (µM) to the negative logarithm, pIC50 (Table 1).
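The IC50-to-pIC50 conversion can be illustrated with a minimal Python sketch. It assumes the common convention that pIC50 is the negative base-10 logarithm of the IC50 expressed in mol/L (so 1 µM corresponds to pIC50 = 6); the function name and the input values are hypothetical and are not taken from the dataset.

```python
import math


def pic50_from_ic50_um(ic50_um: float) -> float:
    """Convert an IC50 given in micromolar to pIC50.

    Assumption: pIC50 = -log10(IC50 in mol/L). Since 1 µM = 1e-6 mol/L,
    this equals 6 - log10(IC50 in µM).
    """
    if ic50_um <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 6.0 - math.log10(ic50_um)


if __name__ == "__main__":
    # Hypothetical IC50 values in µM, for illustration only.
    for ic50 in (0.1, 1.0, 10.0):
        print(f"IC50 = {ic50} µM -> pIC50 = {pic50_from_ic50_um(ic50):.3f}")
```

Under this convention a lower IC50 (a more potent compound) gives a higher pIC50.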
Table 1
Actual and predicted activities of the QSAR training and test sets against the breast cancer cell line
Ligand Name | QSAR Set | Actual Activity (pIC50) | Predicted Activity 1 (pIC50) | Predicted Activity 2 (pIC50) | Predicted Activity 3 (pIC50) | Predicted Activity 4 (pIC50) | Predicted Activity 5 (pIC50) |
1 | test | 6.31 | 5.5844 | 5.79919 | 5.87489 | 6.04983 | 5.90187 |
2 | training | 6.487 | 5.76023 | 6.12184 | 6.3408 | 6.80397 | 6.44003 |
3 | training | 5.555 | 5.51996 | 5.7201 | 5.79577 | 5.76496 | 5.14202 |
4 | training | 6.253 | 5.94353 | 6.1596 | 6.42026 | 6.2196 | 6.20259 |
5 | training | 6.256 | 5.82771 | 5.99926 | 6.25127 | 6.21686 | 6.28192 |
6 | training | 6.01 | 5.84743 | 6.01879 | 6.25655 | 6.05073 | 5.99684 |
7 | test | 6.496 | 5.56063 | 5.57965 | 5.75366 | 5.70082 | 5.70888 |
8 | test | 5.833 | 5.673 | 5.75853 | 5.95099 | 5.86873 | 5.86596 |
9 | training | 6.676 | 5.89365 | 6.10079 | 6.37686 | 6.37163 | 6.56518 |
10 | training | 6.061 | 6.06182 | 6.30008 | 6.54413 | 6.09426 | 5.95739 |
11 | test | 6.783 | 5.59537 | 5.62939 | 5.82481 | 5.78066 | 5.80118 |
12 | test | 6.907 | 5.68428 | 5.79257 | 6.00825 | 5.85856 | 5.84888 |
13 | training | 6.103 | 5.72107 | 5.8538 | 6.09319 | 6.05042 | 6.10813 |
14 | training | 6.175 | 5.96296 | 6.1874 | 6.44975 | 6.20244 | 6.1383 |
15 | test | 5.428 | 5.65956 | 5.75438 | 5.96241 | 5.93361 | 5.99649 |
16 | training | 6.38 | 5.97128 | 6.04601 | 6.22015 | 6.4111 | 6.63049 |
17 | training | 6.337 | 5.90495 | 5.99074 | 6.16504 | 6.32117 | 6.50738 |
18 | test | 5.879 | 5.90762 | 6.04987 | 6.18166 | 5.94229 | 5.95821 |
19 | training | 6.684 | 5.94875 | 6.12088 | 6.29448 | 6.26274 | 6.54899 |
20 | training | 5.503 | 5.94267 | 6.08887 | 6.20632 | 5.82632 | 5.71677 |
21 | training | 6.438 | 5.90968 | 6.08625 | 6.26486 | 6.14972 | 6.408 |
22 | training | 6.108 | 5.97011 | 6.14828 | 6.29882 | 6.09505 | 6.22207 |
23 | test | 6.24 | 6.03649 | 6.25021 | 6.43927 | 6.31826 | 6.45164 |
24 | training | 5.864 | 6.54512 | 6.21324 | 6.07449 | 6.16086 | 6.21624 |
25 | training | 6.592 | 6.5684 | 6.23909 | 6.10817 | 6.23677 | 6.36608 |
26 | training | 6.039 | 6.60758 | 6.26078 | 6.1609 | 6.13899 | 5.94315 |
27 | training | 6.11 | 6.58136 | 6.24758 | 6.11055 | 6.13273 | 6.11966 |
28 | training | 6.194 | 6.59505 | 6.23409 | 6.1336 | 6.16314 | 6.13033 |
29 | test | 6.111 | 6.39065 | 6.06674 | 5.94036 | 5.91041 | 5.9179 |
30 | training | 6.178 | 6.66893 | 6.3345 | 6.18184 | 6.23625 | 6.21633 |
31 | test | 6.147 | 6.53348 | 6.23057 | 6.10343 | 6.12724 | 6.06603 |
32 | test | 5.301 | 6.55433 | 6.22405 | 6.08802 | 6.19076 | 6.26805 |
33 | training | 6.216 | 6.63178 | 6.29882 | 6.21091 | 6.2249 | 6.0643 |
34 | training | 5.944 | 6.5748 | 6.21778 | 6.05109 | 6.03083 | 6.01588 |
35 | training | 6.015 | 6.62562 | 6.27936 | 6.1734 | 6.13996 | 5.9146 |
36 | training | 6.963 | 6.6451 | 6.41967 | 6.43763 | 6.87327 | 7.12247 |
37 | test | 4 | 4.70638 | 4.42757 | 4.48229 | 4.48135 | 4.57172 |
38 | training | 4.367 | 4.43207 | 4.10915 | 4.21651 | 4.23965 | 4.36485 |
39 | test | 4.495 | 4.43951 | 4.18657 | 4.24494 | 4.21096 | 4.04434 |
40 | training | 3.658 | 4.6295 | 4.32799 | 4.38727 | 4.17266 | 3.96599 |
41 | training | 4.125 | 4.57147 | 4.28619 | 4.34942 | 4.31737 | 4.28652 |
42 | training | 4.745 | 4.50961 | 4.20008 | 4.29959 | 4.29317 | 4.41886 |
43 | test | 3.456 | 4.77667 | 4.55714 | 4.69295 | 4.75453 | 4.90077 |
44 | training | 4.62 | 4.65233 | 4.36912 | 4.45143 | 4.44893 | 4.52728 |
45 | test | 5.167 | 4.4661 | 4.14024 | 4.17914 | 4.35446 | 4.32425 |
46 | training | 5 | 4.58932 | 4.34747 | 4.45569 | 4.8908 | 4.98628 |
47 | training | 5 | 4.57144 | 4.32904 | 4.43979 | 4.92455 | 5.10243 |
48 | training | 3.222 | 4.68562 | 4.36547 | 4.40823 | 4.19155 | 3.91416 |
49 | training | 4.602 | 4.66031 | 4.34042 | 4.38146 | 4.392 | 4.32855 |
50 | training | 5 | 4.73804 | 4.3354 | 4.33472 | 4.37375 | 4.46246 |
51 | training | 3.959 | 4.48188 | 4.1418 | 4.1547 | 4.18013 | 3.99695 |
52 | training | 3.824 | 4.38105 | 4.00496 | 4.00933 | 4.0266 | 3.85932 |
53 | training | 4.794 | 5.25431 | 5.27542 | 5.08102 | 4.68984 | 4.78205 |
54 | test | 4.787 | 5.3478 | 5.44144 | 5.27248 | 5.08405 | 5.12362 |
55 | training | 4.78 | 5.10205 | 5.26712 | 5.01929 | 4.70242 | 4.80732 |
56 | training | 5.068 | 5.24274 | 5.52676 | 5.32205 | 5.25332 | 5.20451 |
57 | test | 5.009 | 5.1912 | 5.42778 | 5.2173 | 5.08664 | 5.18102 |
58 | training | 5.631 | 5.32844 | 5.69654 | 5.56753 | 5.75656 | 5.71544 |
59 | training | 5.23 | 5.11467 | 5.30689 | 5.05681 | 4.8407 | 5.20739 |
60 | training | 5.7 | 5.31671 | 5.60922 | 5.46395 | 5.58773 | 5.77767 |
61 | training | 4.973 | 5.12978 | 5.23887 | 4.97872 | 4.59854 | 4.8141 |
62 | test | 4.906 | 5.20559 | 5.40661 | 5.17481 | 4.96481 | 5.01927 |
63 | test | 5.969 | 5.12412 | 5.22926 | 4.96954 | 4.57606 | 4.78921 |
64 | training | 5.174 | 5.23579 | 5.44384 | 5.22184 | 5.05497 | 5.13937 |
65 | training | 5.253 | 5.18071 | 5.50421 | 5.26845 | 5.15406 | 5.1673 |
66 | training | 5.34 | 5.32749 | 5.76992 | 5.585 | 5.71224 | 5.47501 |
67 | test | 6.065 | 5.1784 | 5.49353 | 5.25577 | 5.14544 | 5.19261 |
68 | training | 5.782 | 5.32855 | 5.75991 | 5.5858 | 5.75902 | 5.69701 |
69 | training | 5.358 | 5.21331 | 5.60199 | 5.40661 | 5.4451 | 5.39846 |
70 | training | 6.183 | 5.43285 | 5.96305 | 5.83512 | 6.23091 | 6.12554 |
71 | test | 6.186 | 5.24484 | 5.62242 | 5.43123 | 5.48066 | 5.45858 |
72 | training | 6.267 | 5.45365 | 5.9848 | 5.89205 | 6.37798 | 6.29328 |
Pharmacophore Model generation
The PHASE 3.0 module of Maestro was used to build the pharmacophore and 3D-QSAR models. PHASE defines six types of pharmacophoric features: 1) A = H-bond acceptor, 2) D = H-bond donor, 3) H = hydrophobic, 4) N = negative ion, 5) P = positive ion, and 6) R = aromatic ring. Ligands with biological activity were prepared using the LigPrep module of Maestro to assign protonation states at pH 7.0 ± 2. Among the anti-breast-cancer molecules, those with a value of 6.5 µM or less were treated as active and the others as inactive (66 active and 6 inactive). For the anti-lung-cancer molecules, the activity threshold was set at 5.5 µM, which placed 47 compounds in the active category and 28 in the inactive category. Common-feature pharmacophore models were searched for in both the human breast (MCF-7) and human lung (A-549) cancer cell line sets, using the variety of pharmacophoric sites present in the molecules. All compounds of each cell line were matched, and the best common pharmacophore hypothesis was selected on the basis of the survival score, which can be calculated using Eq. 1. The common pharmacophores were generated using a tree-based partition algorithm with a maximum tree depth of 5 and an inter-site distance of 2 Å.
$$\text{Survival score} = (\text{Vector score}) + (\text{Site score}) + (\text{Volume score}) + (\text{Selectivity score}) + (\text{Number of actives that match the hypothesis} - 1) - (\text{Reference-ligand relative conformational energy}) + (\text{Reference-ligand activity}) \qquad (1)$$
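As a worked illustration of Eq. 1, the following Python sketch adds up the terms exactly as written, with unit weights. The class name, field names and numeric values are hypothetical; they are not PHASE output and do not reproduce any internal weighting the software may apply.

```python
from dataclasses import dataclass


@dataclass
class HypothesisScores:
    """Terms of Eq. 1 for one pharmacophore hypothesis (hypothetical container)."""
    vector: float
    site: float
    volume: float
    selectivity: float
    n_actives_matched: int
    ref_relative_energy: float  # reference-ligand relative conformational energy
    ref_activity: float         # reference-ligand activity


def survival_score(s: HypothesisScores) -> float:
    """Sum the terms of Eq. 1 as written (all weights assumed equal to 1)."""
    return (
        s.vector
        + s.site
        + s.volume
        + s.selectivity
        + (s.n_actives_matched - 1)
        - s.ref_relative_energy
        + s.ref_activity
    )


if __name__ == "__main__":
    # Made-up values, chosen only to show the arithmetic.
    example = HypothesisScores(1.0, 0.5, 0.5, 1.0, 3, 0.25, 6.0)
    print(survival_score(example))  # 1.0 + 0.5 + 0.5 + 1.0 + 2 - 0.25 + 6.0 = 10.75
```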
The top-scored pharmacophore hypothesis, which was AAARR_1 for both types of cancer, was used to generate an atom-based 3D-QSAR model. A pharmacophore-based QSAR model does not consider ligand features beyond the pharmacophore itself, whereas an atom-based QSAR model takes the whole molecular structure into account and is therefore better at explaining the structure-activity relationship (SAR). In this QSAR model, a ligand is treated as a set of van der Waals spheres [36]. The compounds of each developed pharmacophore model (breast and lung cancer) were divided by random selection into a training set (about 70 %) and a test set (the remainder). The compounds with anti-breast-cancer activity were divided into 53 training and 19 test compounds, and those with anti-lung-cancer activity into 53 training and 22 test compounds. The QSAR models were built using 5 PLS factors and a grid spacing of 1 Å [37].
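The random division into training and test sets can be sketched in Python as follows. The function name and the seed are assumptions made for reproducibility of the example; the sketch reproduces only the set sizes reported above, not the actual assignment of individual compounds made in PHASE.

```python
import random


def random_split(ligand_ids, n_train, seed=0):
    """Randomly divide ligand identifiers into training and test sets.

    `seed` is a hypothetical choice; the source does not report one.
    """
    ids = list(ligand_ids)
    if not 0 < n_train < len(ids):
        raise ValueError("n_train must be between 1 and len(ligand_ids) - 1")
    rng = random.Random(seed)
    rng.shuffle(ids)
    return sorted(ids[:n_train]), sorted(ids[n_train:])


# Set sizes taken from the text: 72 breast and 75 lung compounds, 53 in each training set.
breast_train, breast_test = random_split(range(1, 73), n_train=53)
lung_train, lung_test = random_split(range(1, 76), n_train=53)

if __name__ == "__main__":
    print(len(breast_train), len(breast_test))
    print(len(lung_train), len(lung_test))
```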
QSAR model validation
The leave-one-out (LOO) method was used for internal validation of the model, i.e. on the training-set molecules. In this method, each training-set molecule is eliminated in turn and its activity is predicted by the model built from the remaining molecules. The resulting Q2 value describes the stability of the model and was calculated using the following equation:
$$Q^{2}=1-\frac{\sum_{i=1}^{n}\left(X_{pred,i}-X_{i}\right)^{2}}{\sum_{i=1}^{n}\left(X_{i}-X_{m}\right)^{2}} \qquad (2)$$
where $X_{i}$ = actual activity,
$X_{pred,i}$ = predicted activity,
$X_{m}$ = average activity of all molecules in the training set.
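The leave-one-out procedure can be illustrated with a short, self-contained Python sketch. Note that it swaps in a one-descriptor least-squares line in place of the PLS model used in PHASE, purely to keep the example small; the descriptor values and activities are made up, and the helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (xs must not all be equal)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out Q2: refit without molecule i, predict it, accumulate squared errors."""
    n = len(ys)
    mean_y = sum(ys) / n  # X_m: mean activity of the training set
    press = 0.0           # numerator: sum of (X_pred,i - X_i)^2
    for i in range(n):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predicted = slope * xs[i] + intercept
        press += (predicted - ys[i]) ** 2
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


if __name__ == "__main__":
    descriptor = [1.0, 2.0, 3.0, 4.0]  # hypothetical descriptor values
    activity = [2.1, 3.9, 6.2, 7.8]    # hypothetical pIC50-like activities
    print(f"Q2 (LOO) = {loo_q2(descriptor, activity):.3f}")
```

A Q2 close to 1 means that the model predicts each left-out molecule almost as well as the molecules it was fitted on.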
For external validation of the QSAR model, the activity of the test-set molecules was predicted and the predictive R2 (pred_R2) was calculated as follows [38].
$$pred\_R^{2}=1-\frac{\sum_{i=1}^{n}\left(X_{pred,i}-X_{i}\right)^{2}}{\sum_{i=1}^{n}\left(X_{i}-X_{m}\right)^{2}} \qquad (3)$$
where $X_{i}$ = actual activity,
$X_{pred,i}$ = predicted activity,
$X_{m}$ = average activity of all molecules in the test set.
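The pred_R2 calculation can be sketched in Python as below. The five actual/predicted pairs are the test-set rows for ligands 1, 7, 8, 37 and 39 of Table 1 (last predicted-activity column); they are used only as a small illustration, so the printed value is not the statistic of the full model. The function name is hypothetical.

```python
def pred_r2(actual, predicted):
    """pred_R2 = 1 - sum((X_pred,i - X_i)^2) / sum((X_i - X_m)^2).

    X_m is the mean actual activity of the test set, as defined in the text.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_residual / ss_total


# Illustrative subset of test-set ligands from Table 1 (ligands 1, 7, 8, 37, 39).
subset_actual = [6.31, 6.496, 5.833, 4.0, 4.495]
subset_predicted = [5.90187, 5.70888, 5.86596, 4.57172, 4.04434]

if __name__ == "__main__":
    print(f"pred_R2 on the illustrative subset = {pred_r2(subset_actual, subset_predicted):.3f}")
```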
Phase Ligand Screening
Phase ligand screening is used to screen a library of ligands against a developed pharmacophore hypothesis. The best hypothesis for both cancers, AAARR_1, was used to screen a library of 3400 coumarin derivatives and retrieve the best 20 ligands together with their predicted activity.
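The final selection step can be sketched in Python as a simple top-20 ranking. This assumes that each screened ligand already carries a single numeric score and that a higher score is better; the ranking criterion, the ligand names and the randomly generated scores are all hypothetical stand-ins, not Phase output.

```python
import heapq
import random


def top_ligands(scores, n=20):
    """Return the n (name, score) pairs with the highest score, best first."""
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])


# Hypothetical library of 3400 entries with made-up scores.
rng = random.Random(0)
library = {f"coumarin_{i:04d}": rng.uniform(3.0, 7.0) for i in range(1, 3401)}
hits = top_ligands(library, n=20)

if __name__ == "__main__":
    for name, score in hits[:5]:
        print(name, round(score, 3))
```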
Designing of novel coumarin derivatives and its validation
The developed atom-based 3D-QSAR model was used to design novel coumarin derivatives for both cancer cell lines. The designed ligands were validated using docking and molecular dynamics (MD) simulation.
Docking studies
The development of new blood vessels in a tumour mass is known as angiogenesis. It is recognised as one of the most relevant hallmarks of cancer and plays a particularly important role in regulating breast tumours. The VEGFR-2 (vascular endothelial growth factor receptor 2) tyrosine kinase plays an important role in angiogenesis, mainly in breast cancer; therefore, the VEGFR-2 protein (PDB ID: 4ASD) was selected for docking [34].
Protein kinase C iota (PKCι) is an oncogene that plays a major part in the initiation of cell division and in the maintenance and progression of various types of human cancer. In the lung, PKCι is essential for maintaining the transformed phenotype of the two dominant types of non-small cell lung cancer (NSCLC): 1) lung adenocarcinoma (LADC) and 2) lung squamous cell carcinoma (LSCC). In addition, PKCι is essential for tumorigenesis by establishing and maintaining a highly aggressive, stem-like, tumour-initiating cell phenotype. Therefore, protein kinase Cι (PKCι) (PDB ID: 3A8W) was selected as the target protein for lung cancer [39]. The crystal structures of both proteins were retrieved from the RCSB Protein Data Bank and were prepared and minimized according to Maurya AK et al. [37].
The Glide module of Maestro was used to perform molecular docking at the receptor binding sites of the breast and lung cancer target proteins. Docking of both proteins was performed according to Maurya AK et al. [37, 40]. The designed and Phase-screened ligands were used for molecular docking.
Molecular Dynamics Simulation
The ligands showing the best docking scores with the VEGFR-2 and PKCι proteins were studied further for conformational flexibility and stability using GROMACS 18.4 [24]. The CHARMM36 force field (2021 release) [41] was used for the protein parameters, and SwissParam was used to generate the ligand force field parameters [42]. Each ligand-protein complex was then solvated using the TIP3P water model and neutralized by the addition of Cl−/Na+ ions. The system was energy-minimized and equilibrated using an ad hoc bash script. NVT and NPT equilibration were carried out at constant volume and at constant pressure (1 bar), respectively, at a temperature of 300 K. Finally, both complex systems were subjected to a production MD run of 100 ns each.
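To show how the stated run length, temperature and pressure translate into simulation settings, the following Python sketch computes the number of integration steps and formats a few GROMACS `.mdp`-style lines. The 2 fs time step is an assumption (the source does not report it), the function names are hypothetical, and the output is a fragment for illustration rather than a complete parameter file.

```python
def md_steps(total_ns: float, dt_fs: float = 2.0) -> int:
    """Number of integration steps for a run of total_ns nanoseconds.

    dt_fs is the time step in femtoseconds; 2 fs is an assumed value.
    """
    total_fs = total_ns * 1_000_000  # 1 ns = 1,000,000 fs
    return round(total_fs / dt_fs)


def mdp_fragment(total_ns=100.0, dt_fs=2.0, temperature_k=300.0, pressure_bar=1.0):
    """Format selected .mdp options: dt is in ps, ref_t in K, ref_p in bar."""
    return [
        f"dt      = {dt_fs / 1000:g}",
        f"nsteps  = {md_steps(total_ns, dt_fs)}",
        f"ref_t   = {temperature_k:g}",
        f"ref_p   = {pressure_bar:g}",
    ]


if __name__ == "__main__":
    print("\n".join(mdp_fragment()))
```

With these assumptions, a 100 ns production run corresponds to 50,000,000 steps.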