This study aims to identify novel allosteric inhibitors of PRMT3 using a pharmacophore hypothesis generated from the known chemically derived inhibitor SGC707, validated with 3D-QSAR models and applied in high-throughput virtual screening of the Selleckchem, ChemDiv and Lifechemicals databases. The screened compounds were further assessed as drug candidates by ADMET prediction, and the top lead compounds were carried forward to molecular dynamics simulations to examine the binding stability of the protein–ligand complexes.
3.1 PRMT3 expression in CRC stages:
In order to identify genes associated with the progression of CRC, we selected the dataset GSE41011 and focused on candidates expressed across the different stages of CRC. Clinical stage information (I, II, III and IV) was collected together with distant normal non-cancerous samples. First, we identified the differentially expressed genes (DEGs) for each clinical stage of CRC relative to its distant normal tissue under stringent significance criteria, and duplicate gene symbols were then removed within each stage. With logFC > 2 and an adjusted P-value < 0.05, 93 DEGs were identified in Stage I, 107 in Stage II, 71 in Stage III and 308 in Stage IV. Integrated analysis of the DEGs using a Venn diagram (Fig. 2A) revealed that PRMT3 was shared by, and significantly over-expressed in, all four stages of CRC (Fig. 2D). Figures 2B and 2D show the stage-wise relative expression of PRMT3. These findings show that PRMT3 expression levels correlate significantly with tumor stage.
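The stage-wise filtering and Venn-style overlap described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the record layout, the function names and the toy values are assumptions, and only the thresholds (logFC > 2, adjusted P < 0.05) come from the text.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical record layout: (gene_symbol, logFC, adjusted_p_value).
DegRow = Tuple[str, float, float]


def significant_genes(rows: List[DegRow],
                      min_logfc: float = 2.0,
                      max_adj_p: float = 0.05) -> Set[str]:
    """Return unique gene symbols with logFC > min_logfc and adj. P < max_adj_p."""
    # Building a set removes duplicate gene symbols (e.g. several probes per gene).
    return {gene for gene, logfc, adj_p in rows
            if logfc > min_logfc and adj_p < max_adj_p}


def shared_across_stages(per_stage: Dict[str, List[DegRow]]) -> Set[str]:
    """Intersect the significant gene sets of all stages (the Venn core)."""
    gene_sets = [significant_genes(rows) for rows in per_stage.values()]
    return set.intersection(*gene_sets) if gene_sets else set()


# Toy input for illustration only; these are NOT values from GSE41011.
toy_stages = {
    "I":   [("PRMT3", 2.4, 0.010), ("GENE_A", 2.9, 0.020), ("PRMT3", 2.2, 0.030)],
    "II":  [("PRMT3", 2.6, 0.004), ("GENE_B", 1.1, 0.001)],
    "III": [("PRMT3", 2.1, 0.040), ("GENE_A", 3.0, 0.200)],
    "IV":  [("PRMT3", 3.2, 0.001), ("GENE_A", 2.5, 0.010)],
}

common = shared_across_stages(toy_stages)
print(sorted(common))
```

In the toy data, only the gene that passes both thresholds in every stage survives the intersection, which mirrors how a single candidate can emerge from the four-way overlap.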
PRMT3 expression was also assessed in a panel of human colorectal cancer samples covering different clinical stages, using online omics resources including GEPIA (http://gepia.cancer-pku.cn/), UALCAN (https://ualcan.path.uab.edu/) and cBioPortal (https://www.cbioportal.org/) (Fig. 2C,F). PRMT3 was significantly up-regulated across all tumor stages in the associated cancer cohorts (TCGA, GTEx, etc.). Our results indicate heterogeneity of PRMT3 expression during CRC progression and suggest that PRMT3 might constitute a potent candidate for therapeutic intervention [48].
3.2 Pharmacophore generation
The pharmacophore models were generated using the Phase module of the Schrödinger suite with the standard set of pharmacophore features and default parameters, taking the known inhibitor SGC707 (IC50 = 31 nM) as input. In total, 60 common pharmacophore hypotheses were generated from different combinations of 20 variants. Among the 20 variants, five pharmacophore models (DDHHR_1, DDHHR_3, DDHHR_4, DDHHR_5 and DDHHR_6) were selected for further analysis (Supp. Table 1). The scoring procedure ranks the hypotheses, allowing a rational choice among them. The scoring algorithm accounts for the alignment of site points and vectors, volume overlap, number of ligands matched, selectivity, relative conformational energy, and activity [33]. The five hypotheses that survived the scoring process were used to build atom-based QSAR models. DDHHR_1 was selected as the best hypothesis because it had the highest survival score (5.287), while all other parameters were similar across the generated models (Fig. 3).
3.3 Analysis of 3D-QSAR:
To assess the predictive accuracy of the pharmacophore model, 3D-QSAR models were evaluated for the five pharmacophore hypotheses obtained. The statistical significance and predictive ability of the model, based on the partial least-squares (PLS) regression method, were stable at 3 without adverse deviations. DDHHR_1 was found to have an acceptable R2 value of 0.3414, a predictive Q2 value of -0.2921, a Pearson R value of -0.2693 and a standard deviation (SD) of 1.5044. In addition, the selected DDHHR_1 hypothesis had a P value of 0.0461, a pIC50 value of 7.50 and a high cross-validated correlation coefficient, indicating its suitability (Supp. Table 2). The atom-based 3D-QSAR model is displayed in Fig. 4.
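As a rough aid to reading the statistics above, the sketch below shows how a pIC50 value follows from an IC50 and how a coefficient of determination is usually computed. The function names and the toy arrays are illustrative assumptions. Q2 is commonly obtained by applying the same expression to held-out compounds, but exactly how Phase computes it is not stated in the text and is assumed here.

```python
import math
from typing import Sequence


def pic50_from_nanomolar(ic50_nm: float) -> float:
    """pIC50 = -log10(IC50 in mol/L); the IC50 is supplied in nM."""
    return -math.log10(ic50_nm * 1e-9)


def coefficient_of_determination(observed: Sequence[float],
                                 predicted: Sequence[float]) -> float:
    """Return 1 - SS_res / SS_tot for paired observed and predicted activities.

    On training data this is R2; on held-out data the same expression is
    commonly reported as Q2 (an assumption about the convention used here).
    A negative value means the predictions fit worse than the observed mean.
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# SGC707 IC50 of 31 nM, as quoted in the text.
sgc707_pic50 = pic50_from_nanomolar(31.0)
print(round(sgc707_pic50, 2))

# Toy activities for illustration only; not data from the study.
toy_observed = [1.0, 2.0, 3.0]
print(coefficient_of_determination(toy_observed, [1.0, 2.0, 3.0]))
print(coefficient_of_determination(toy_observed, [3.0, 2.0, 1.0]))
```

The 31 nM IC50 converts to a pIC50 of about 7.51, in line with the value of 7.50 reported for the hypothesis.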
Table 2
Docking score and binding energy of top hit compounds from various databases with PRMT3
S.No | Compound name/no. | Glide score (kcal/mol) | Glide energy (kcal/mol) | Prime energy | IFD score | Interacting residues |
- | SGC707 | -7.687 | -45.364 | -11792.50 | -597.42 | H-bond: R396, E422 |
Selleckchem compounds |
1 | Cladribine | -8.187 | -47.709 | -11656.55 | -591.01 | H-bond: D395, E422, T466 |
2 | Capecitabine | -7.032 | -45.523 | -11648.79 | -589.47 | H-bond: D389, R396, E422, T466 |
3 | Gefitinib | -5.228 | -48.011 | -11663.53 | -588.82 | H-bond: K392, R396; Pi-cation: K392, R396, F399 |
ChemDiv compounds |
4 | D585-0607 | -5.877 | -33.624 | -11707.58 | -592.25 | H-bond: D395, R396, E422; Salt bridge: K392, D395; Pi-cation: K392 |
5 | D175-0195 | -5.361 | -42.711 | -11673.09 | -589.12 | H-bond: V421, E422, K498; Salt bridge: E422 |
6 | F602-1150 | -5.084 | -40.018 | -11665.32 | -589.38 | H-bond: K392, D395, R396, E422 |
Lifechemicals compounds |
7 | F6127-0048 | -5.930 | -36.416 | -11652.74 | -589.84 | H-bond: K392, D395, R396 |
8 | F0852-0155 | -5.277 | -30.485 | -11585.95 | -586.60 | H-bond: K392, R396, E422; Salt bridge: K392 |
9 | F1361-0042 | -5.268 | -35.408 | -11585.09 | -586.67 | H-bond: D395, E422; Pi-cation: K392 |
3.4 Virtual screening for the identification of PRMT3 prospective Allosteric Inhibitors from diverse compound libraries:
Virtual screening was performed against the allosteric sites predicted by PASSer and Allosite-Pro, using the ChemDiv PRMT library, the Selleckchem natural product library and the Lifechemicals natural compound library with the DDHHR_1 pharmacophore hypothesis [49]. The allosteric residues in the protein pocket were identified as K392, H393, D395, R396, V387, D389, V420, E422, L424, T466, K498, A467, V501 and L503, while other parameters of the allosteric site (i.e., pocket volume, SASA, perturbation score, and allosite score) are provided in Table 1. To establish a suitable cutoff score for the docking study, the compound libraries from Selleckchem, ChemDiv and Lifechemicals were subjected to virtual screening, and the hits were found to interact with the allosteric site of PRMT3 through several molecular interactions. The GlideXP algorithm identified Cladribine as the first hit from the Selleckchem library, with a docking score of -8.187 and a binding energy of -47.709 kcal/mol. Cladribine formed hydrogen bonds with residues D395, E422 and T466. Capecitabine and Gefitinib were the second- and third-ranked compounds from the natural compound library, with glide scores of -7.032 and -5.228, respectively (Table 2). Both of these compounds also showed strong molecular interactions with the allosteric site of the target protein. Moreover, Gefitinib formed multiple pi-cation interactions with the K392, R396 and F399 residues of the allosteric site. Full 3-dimensional and 2-dimensional interaction poses can be seen in Fig. 5.
From the ChemDiv PRMT3 library of compounds, GlideXP ranked D585-0607 as the first hit against the PRMT3 protein, with a glide score of -5.877. This compound formed multiple hydrogen bonds with ASP395, ARG396 and GLU422 of the allosteric site, along with van der Waals-type interactions with LYS392 and ASP395. Moreover, LYS392 of the allosteric binding pocket exhibited a strong pi-cation interaction with the aromatic ring of this compound. The other two top-ranked compounds, D175-0195 and F602-1150, had glide docking scores of -5.361 and -5.084, respectively (Supp. Table 3). These compounds are also rich in aromatic moieties, resulting in strong hydrophobic interactions between the allosteric site of PRMT3 and the ligands. The second lead compound, D175-0195, formed multiple strong hydrogen bonds with V421, E422 and K498 of the allosteric site. The third compound, F602-1150, also showed strong hydrogen-bonded interactions with K392, D395, R396 and E422. The 3-dimensional and 2-dimensional interaction poses of the top hit compounds are depicted in Fig. 6.
Table 3
MM-GBSA binding energies (kcal/mol), with ligand-receptor conformational changes accounted for in the energy terms.
Compound | MM-GBSA dG Bind | MM-GBSA dG Bind Coulomb | MM-GBSA dG Bind Hbond | MM-GBSA dG Bind Lipo | MM-GBSA dG Bind vdW | MM-GBSA dG Bind (NS) Coulomb |
Selleckchem compounds |
Cladribine | -251.65 | -1.83 | -0.53 | -0.00 | -0.16 | -0.00 |
Capecitabine | -26.34 | -3.47 | -0.06 | -7.66 | -18.95 | -3.82 |
Gefitinib | -18.10 | -5.75 | -0.03 | -9.39 | -15.28 | -2.95 |
ChemDiv compounds |
D585-0607 | -30.40 | -44.24 | -2.73 | -10.67 | -34.35 | -45.42 |
D175-0195 | -41.20 | -33.69 | -3.45 | -14.91 | -34.67 | -35.70 |
F602-1150 | -37.39 | -23.21 | -3.16 | -12.71 | -35.12 | -22.05 |
Lifechemicals compounds |
F6127-0048 | -29.81 | -23.94 | -3.15 | -7.58 | -31.02 | -24.94 |
F0852-0155 | -13.78 | 39.92 | -1.72 | -6.21 | -28.66 | 38.40 |
F1361-0042 | -33.22 | -25.18 | -2.72 | -12.77 | -24.04 | -27.02 |
NS stands for “no strain” |
Table 4
ADMET and Druglikeness characteristics of the identified top hit compounds
Column groups: physicochemical properties (Lipinski rule of five) = MW through consensus log P; solubility = Log S (ESOL) and Log S (Ali); pharmacokinetics = GI absorption and CYP inhibitor.
Compound name/no. | MW (g/mol) | HB donors | HB acceptors | No. of rotatable bonds | Consensus log P | Log S (ESOL) | Log S (Ali) | GI absorption | CYP inhibitor |
Selleckchem compounds |
Cladribine | 285.69 | 3 | 6 | 2 | -0.26 | Very soluble | Soluble | High | No |
Gefitinib | 446.90 | 1 | 7 | 8 | 3.92 | Moderately soluble | Poorly soluble | High | Yes |
Capecitabine | 359.35 | 3 | 8 | 8 | 0.8 | Soluble | Soluble | High | No |
ChemDiv compounds |
D585-0607 | 335.40 | 3 | 4 | 6 | 1.48 | Soluble | Moderately soluble | Low | Yes |
D175-0195 | 361.78 | 3 | 6 | 8 | 2 | Soluble | Moderately soluble | High | Yes |
F602-1150 | 415.46 | 2 | 5 | 8 | 2.85 | Soluble | Moderately soluble | High | No |
Lifechemicals compounds |
F6127-0048 | 288.34 | 4 | 3 | 5 | 1.25 | Soluble | Soluble | High | No |
F0852-0155 | 222.20 | 4 | 5 | 3 | -0.12 | Very soluble | Very soluble | High | No |
F1361-0042 | 290.70 | 4 | 6 | 3 | 0.19 | Very soluble | Very soluble | High | No |
The Lifechemicals natural product-like compound library is a collection of 14,600 synthetic compounds selected on the basis of chemical descriptors, natural-likeness scoring and similarity to natural scaffolds. The best-ranked compound from this library was F6127-0048, which made extensive molecular contacts with the PRMT3 allosteric site, formed conventional hydrogen bonds with K392, D395 and R396, and had a glide docking score of -5.930. The other two best-ranked compounds, F0852-0155 and F1361-0042, also showed strong and diverse molecular interactions, with glide scores of -5.277 and -5.268, respectively (Table 2). The second-ranked compound, F0852-0155, formed a salt bridge with K392 of the allosteric site and also engaged this residue, together with R396 and E422, through conventional hydrogen bonds. The third hit compound, F1361-0042, also made extensive molecular contacts, engaging GLU422 twice, along with D395, in hydrogen-bonded interactions. This molecule also displayed a pi-cation interaction with K392. Fig. 7 shows the three compounds from the Lifechemicals natural product-like library in 3-dimensional and 2-dimensional interaction poses, while Table 2 lists their docking scores and binding energies.
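The comparison of docking scores across the three libraries can be sketched as below. The Glide scores are copied from Table 2; the function names and the choice to use SGC707 as the comparison baseline are assumptions made for illustration, not a description of the screening workflow itself.

```python
from typing import Dict, List, Tuple

# Glide scores (kcal/mol) copied from Table 2, grouped by library.
GLIDE_SCORES: Dict[str, Dict[str, float]] = {
    "Selleckchem": {"Cladribine": -8.187, "Capecitabine": -7.032, "Gefitinib": -5.228},
    "ChemDiv": {"D585-0607": -5.877, "D175-0195": -5.361, "F602-1150": -5.084},
    "Lifechemicals": {"F6127-0048": -5.930, "F0852-0155": -5.277, "F1361-0042": -5.268},
}
REFERENCE_SCORE = -7.687  # SGC707, Table 2


def rank_library(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort compounds from most to least negative (most favourable first)."""
    return sorted(scores.items(), key=lambda item: item[1])


def better_than_reference(libraries: Dict[str, Dict[str, float]],
                          reference: float) -> List[str]:
    """List compounds whose Glide score is more negative than the reference."""
    return [name
            for scores in libraries.values()
            for name, score in scores.items()
            if score < reference]


for library, scores in GLIDE_SCORES.items():
    print(library, [name for name, _ in rank_library(scores)])

print(better_than_reference(GLIDE_SCORES, REFERENCE_SCORE))
```

With the Table 2 values, Cladribine is the only hit whose Glide score is more negative than that of SGC707.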
3.5 Molecular Dynamics Simulation analysis of PRMT3 with selected lead compounds:
Molecular dynamics simulations of the three selected Selleckchem complexes revealed their stability over the 100 ns simulation time. The RMSD and RMSF values of the three complexes can be seen in Fig. 9. Trajectory analysis of these complexes showed that Cladribine and Gefitinib displayed better stability than Capecitabine. Cladribine and Gefitinib made various types of contacts during the 100 ns simulation. Water-bridged hydrogen bonds were the most prominent, followed by conventional hydrogen bonds with the allosteric residues ASP395, ARG396 and GLU422. Within the allosteric site of PRMT3, LEU503 in particular made significant hydrophobic interactions in both complexes. In addition, the Gefitinib complex made hydrophobic contacts with the allosteric residues PHE399 and VAL420, as shown in the protein-ligand contact histogram in Fig. 9C. Both ligands retained in the MD studies the interactions seen in docking. The contact intensity of each allosteric residue with Gefitinib is shown in Fig. 9C, where residues K392, D395, R396, F399 and L503 of the PRMT3 allosteric site remained in constant contact with the ligand along the 100 ns simulation trajectory.
The RMSD of Cladribine started at 1.2 Å, which is fairly stable, and remained within about 2 Å until the end of the simulation. The RMSF plot shows the residue-wise fluctuations in the protein–ligand complexes and helps identify conformational changes over the entire simulation period. Throughout the simulations, all the protein–ligand complexes exhibited uniform fluctuations at most residues, except for a significant displacement of nearly 6 Å at residues 359–370, indicating considerable mobility in that region and the presence of a loop between two beta strands. Such loop regions often house the orthosteric sites of enzymes, suggesting that allosteric binding considerably changes the conformation of the binding site.
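For readers less familiar with the two metrics, the sketch below shows how RMSD and RMSF are defined for a set of atoms. It assumes the frames are already superimposed on the reference and share the same atom order; the alignment step performed by MD analysis tools is deliberately omitted. The coordinates are toy values, not data from the simulations.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
Frame = Sequence[Coord]


def rmsd(frame: Frame, reference: Frame) -> float:
    """RMSD between two pre-aligned frames with identical atom ordering."""
    if len(frame) != len(reference) or not frame:
        raise ValueError("frames must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(frame, reference))
    return math.sqrt(squared / len(frame))


def rmsf(trajectory: Sequence[Frame]) -> List[float]:
    """Per-atom RMSF around each atom's mean position over the trajectory."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = tuple(sum(frame[i][k] for frame in trajectory) / n_frames
                     for k in range(3))
        # Mean squared deviation of atom i from its average position.
        msd = sum(math.dist(frame[i], mean) ** 2 for frame in trajectory) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy two-atom, two-frame trajectory (coordinates in Å) for illustration only.
reference_frame = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
later_frame = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]

print(rmsd(later_frame, reference_frame))
print(rmsf([reference_frame, later_frame]))
```

RMSD gives one number per frame (how far the whole structure has drifted from the reference), whereas RMSF gives one number per atom or residue (how mobile that position is over time), which is why a flexible loop shows up as a peak in the RMSF plot.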
The ligands D585-0607 and D175-0195 from the ChemDiv PRMTs library remained stable in complex with the PRMT3 allosteric site during the simulation. The RMSD of the protein in the complex can be seen in Fig. 10. D175-0195 started to fluctuate at 15 ns, but the fluctuation was modest and the change in RMSD stayed below the 3 Å threshold. The protein RMSD remained stable in this way throughout the 100 ns simulation. The ligand RMSD also started low, but at about 15 ns it increased and stayed elevated until 35 ns. It then decreased, and the ligand stabilized in the allosteric site. The increase in the ligand RMSD was due to torsional movement of some of the rotatable bonds in this ligand. The interaction analysis of D175-0195 inside the allosteric site revealed that this compound maintained the different types of molecular contacts previously seen in the XP docking studies. The contact histogram can be seen in Fig. 11A, where the major molecular interactions observed in the MD simulations were hydrophobic interactions and water-bridged hydrogen bonds at V421 and E422. The ligand D585-0607 remained stable until the 35 ns mark, after which its RMSD fluctuated strongly inside the allosteric site (Fig. 10). The RMSD later aligned with that of the C-alpha backbone of the protein; however, the trajectory analysis suggests that the ligand was in contact with the allosteric site of PRMT3 at the start of the simulation, but that major conformational changes later occurred in the ligand structure, resulting in fewer interactions. The interaction analysis of the trajectory revealed that this compound engaged in ionic, hydrophobic and H-bond interactions with the allosteric site (Fig. 10A), but its interaction intensity was low compared with the top leads identified from the other libraries.
Compounds from the Lifechemicals natural product-like library in complex with the PRMT3 allosteric site displayed higher RMSD fluctuations than the ligands from the other libraries. However, ligand F6127-0048 showed a higher intensity of interactions, involving ionic interactions, water-assisted hydrogen bonds and hydrophobic contacts at K392, D395, R396, V421 and E422.
3.6 Prime MM-GBSA analysis:
The relative binding energies of the selected lead compounds from the various compound libraries with the allosteric site of PRMT3 were determined using the Prime MM-GBSA methodology. The calculation reports the contributions of several interaction types between the protein and the allosteric ligands, including van der Waals, hydrophobic and coulombic interactions, among others. The MM-GBSA results showed that these allosteric inhibitors have strong binding energies with the allosteric site of PRMT3, and that the binding energy terms (i.e., van der Waals binding energy (dG Bind vdW), lipophilic energy (dG Bind Lipo) and coulomb energy (dG Bind Coulomb)) had more negative values, which correspond to stronger binding of the compounds to the allosteric site. Cladribine displayed the most favorable binding energy, followed by D175-0195, F602-1150 and F1361-0042 (Table 3). The nine compounds screened from the three compound libraries showed diverse scores, which are tabulated in Table 3. The evaluation of binding free energy under different criteria based on the Prime MM-GBSA method indicated that Cladribine, D175-0195, F602-1150 and F1361-0042 had the strongest binding energies, which also supports the stability of these molecules in complex with PRMT3.
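The ranking described above can be reproduced from the total binding energies in Table 3. The sketch below is only an illustration of the ordering step; the dictionary and function names are assumptions, and the values are the dG Bind column copied from Table 3.

```python
from typing import Dict, List

# MM-GBSA dG Bind values (kcal/mol) copied from Table 3.
MMGBSA_DG_BIND: Dict[str, float] = {
    "Cladribine": -251.65,
    "Capecitabine": -26.34,
    "Gefitinib": -18.10,
    "D585-0607": -30.40,
    "D175-0195": -41.20,
    "F602-1150": -37.39,
    "F6127-0048": -29.81,
    "F0852-0155": -13.78,
    "F1361-0042": -33.22,
}


def rank_by_binding_energy(energies: Dict[str, float]) -> List[str]:
    """Order compounds from most to least negative dG Bind (strongest first)."""
    return sorted(energies, key=energies.get)


ranking = rank_by_binding_energy(MMGBSA_DG_BIND)
top_four = ranking[:4]
print(top_four)
```

Sorting by the most negative dG Bind places Cladribine, D175-0195, F602-1150 and F1361-0042 in the first four positions, matching the order given in the text.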
3.7 ADME/T analysis:
Pharmacokinetic, physicochemical and drug-likeness properties were evaluated with the SwissADME online tool, which takes structures in the simplified molecular-input line-entry system (SMILES) format. Orally active drugs are expected to comply with Lipinski's rule of five, which requires molecular weight (MW) < 500 g/mol, hydrogen bond acceptors < 10, hydrogen bond donors and log P values < 5, and the number of rotatable bonds < 10 [29]. The molecular structures of all nine lead molecules were imported into SwissADME, and the different drug-likeness criteria are depicted in the bioavailability radar in Fig. 8 and in Table 4. All nine compounds had a bioavailability score of 0.55, indicating their suitability as oral drug candidates for PRMT3 allosteric inhibition. None of the compounds displayed violations of Lipinski's rule or raised PAINS or Brenk alerts. The compounds Gefitinib, Capecitabine, F6127-0048 and F602-1150 exhibit good drug-like qualities, with strong solubility and no excretion problems, since they do not interfere with pharmacokinetics and do not inhibit CYP enzymes.
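The drug-likeness criteria listed above can be checked against the Table 4 values with a short script. This is a minimal sketch and not the SwissADME implementation: the class and function names are assumptions, and the thresholds are taken exactly as stated in the text (strict inequalities, including the rotatable-bond limit), which may differ slightly from other formulations of Lipinski's rule.

```python
from typing import Dict, List, NamedTuple


class Props(NamedTuple):
    mw: float      # molecular weight, g/mol
    hbd: int       # hydrogen bond donors
    hba: int       # hydrogen bond acceptors
    rot: int       # rotatable bonds
    logp: float    # consensus log P


# Values copied from Table 4.
TABLE4: Dict[str, Props] = {
    "Cladribine": Props(285.69, 3, 6, 2, -0.26),
    "Gefitinib": Props(446.90, 1, 7, 8, 3.92),
    "Capecitabine": Props(359.35, 3, 8, 8, 0.8),
    "D585-0607": Props(335.40, 3, 4, 6, 1.48),
    "D175-0195": Props(361.78, 3, 6, 8, 2.0),
    "F602-1150": Props(415.46, 2, 5, 8, 2.85),
    "F6127-0048": Props(288.34, 4, 3, 5, 1.25),
    "F0852-0155": Props(222.20, 4, 5, 3, -0.12),
    "F1361-0042": Props(290.70, 4, 6, 3, 0.19),
}


def violations(p: Props) -> List[str]:
    """Return the criteria (as worded in the text) that a compound fails."""
    checks = [
        ("MW < 500", p.mw < 500),
        ("HB donors < 5", p.hbd < 5),
        ("HB acceptors < 10", p.hba < 10),
        ("log P < 5", p.logp < 5),
        ("rotatable bonds < 10", p.rot < 10),
    ]
    return [label for label, passed in checks if not passed]


report = {name: violations(props) for name, props in TABLE4.items()}
for name, failed in report.items():
    print(name, "violations:", failed)
```

Applied to the Table 4 values, every compound returns an empty violation list, consistent with the statement that none of the nine leads violates Lipinski's rule.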