Sequence retrieval and protein structure prediction:
The protein Fasta sequence of TCF7L2 causing diabetes mellitus was retrieved from the Uniprot database under accession number Q9NQB0 to generate the high-quality protein model. Robetta was used to generate a protein model, and further, model 1 with a confidence score of 0.89 was selected to design the drug Figure 2
Protein 3D structure validation
PROCHECK comprehensibly displays the Ramachandran plot based on the polypeptide chain's angle pairs (Ф, Ѱ) in a known protein structure. The Ramachandran plot statistics determined the overall quality factor of 89.9% of amino acid residues shown in Figure 3. The ERRAT estimated the overall quality of the protein model was 86.655 by analyzing the statistics of non-bonded interactions between different types of atoms shown in Figure 4. The QMEAN server showed that the Z-score value indicated the overall protein structure quality, estimated at -1.39 and considered good, as shown in Figure 5.
Protein Network Map
The protein-protein interaction map of the TCF7L2 tells us that this protein interacts with five more proteins. These proteins include BCL9, CTBBB1, EP300, CTBP1, and NLK which can be seen in Figure 6. The number of nodes was 6 and 11 edges with a node degree of 3.67. The average clustering coefficient was 0.839. CTNNB1 or Catenin beta-1, is a key protein involved in the Wnt signaling pathway, regulating cellular homeostasis and playing roles in cancer development. BCL9, or B-cell CLL/lymphoma 9 protein, participates in signal transduction through the Wnt pathway, enhancing beta-catenin's transcriptional activity, and it is a member of the BCL9 protein family. CTBP1: Corepressor in Wnt pathway; targets GLIS2, BCL6; regulates Golgi structure, BAT differentiation; has dehydrogenase activity. Serine/threonine-protein kinase NLK; Regulates transcription factors involved in cell fate determination. Acts downstream of WNT5A, MAP3K7/TAK1, and HIPK2 in the non-canonical Wnt signaling pathway. Forms a complex with SETDB1, leading to PPARG target gene transcriptional silencing. EP300, also known as Histone acetyltransferase p300, is a multifunctional protein involved in histone acetylation, chromatin remodeling, and transcriptional regulation. It acetylates all four core histones, leading to histone acetylation that facilitates transcriptional activation by providing an epigenetic tag. Additionally, EP300 mediates cAMP-gene regulation by binding to phosphorylated CREB protein and acetylates histone H3 at specific lysine residues, such as Lys-122 and Lys-27, which promotes transcription by inducing nucleosome instability.
Phytochemicals Retrieval
Phytochemicals of Moringa oleifera essential oil were used as drug candidates against the TCF7L2 protein. A total of 25 phytochemicals were selected for further analysis. The phytochemicals that were retrieved can be seen in Table 1.
Table 1: Phytochemicals of Moringa retrieved from PubChem
Sr.
|
Phytochemicals
|
PubChem CID
|
1
|
Camphene
|
6616
|
2
|
Sabinene
|
18818
|
3
|
6-Methyl-5-hepten-2-one
|
9862
|
4
|
Myrcene
|
31253
|
5
|
p-Cymene
|
7463
|
6
|
Limonene
|
22311
|
7
|
(z)-beta-Ocimene
|
5320250
|
8
|
(E)-beta-Ocimene
|
5281553
|
9
|
Terpinolene
|
11463
|
10
|
Linalool
|
6549
|
11
|
b-Thujone
|
91456
|
12
|
Cis-p-Menth-2-en-1-ol
|
5319367
|
13
|
borneal
|
64685
|
14
|
Terpinen-4-ol
|
11230
|
15
|
p-Cymen-8-ol
|
14529
|
16
|
Carvotanacetone
|
6432475
|
17
|
Piperitone
|
6987
|
18
|
Thymol
|
6989
|
19
|
Carvacrol
|
10364
|
20
|
Alpha-cubebene
|
442359
|
21
|
Eugenol
|
3314
|
22
|
Beta-cubebene
|
93081
|
23
|
Germacrene D
|
5317570
|
24
|
Valencene
|
9855795
|
25
|
Viridiflorene
|
10910653
|
ADMET analysis
The first screening of phytochemicals was based on the ADMET properties. ADMET studies were carried out on 25 phytochemicals to determine physiochemical and pharmacokinetic parameters of the candidate drug covering molecular weight, hydrogen bond donor and acceptor, no. of rotatable bonds, blood-brain barrier, gastrointestinal tract, absorption, lipophilicity, solubility, drug-drug interaction, metabolism, synthetic accessibility, and permeability. After screening, 7 phytochemicals passed out this screeding from which the carvacrol was the lead compound. Table 2 shows the ADMET properties of carvacrol.
Table 2: ADMET parameters including physiochemical and pharmacokinetics features of Carvacrol bioactive compound.
The boiled egg picture represents an intuitive estimation of brain penetration and passive gastrointestinal absorption, with the yellow region indicating a high probability of brain penetration and the white region indicating a high probability of passive gastrointestinal absorption
25. The boiled egg model of the bioactive compound Carvacrol shows that the drug will penetrate the blood-brain barrier (BBB) and Gastrointestinal tract, as shown in Figure 7.
Figure 7: SwissADME interpreted that the carvacrol will penetrate the Blood-Brain Barrier (BBB) and Gastrointestinal tract, as represented by the red dot.
Oral toxicity of carvacrol was predicted on a scale of 4 with 100% accuracy and similarity. Carvacrol passed all the points except Tox21-Stress response pathways, which were active with 1.0 probability. The other checkpoints can be seen in Table 3 and Figure 8. This predicts that carvacrol was the lead compound in other analyses and has low oral toxicity. Drug likeness of all phytochemicals was predicted with molinspiration and FAFdrug4. Both tools passed Lipinski’s rule of 5 for carvacrol.
Table 3: ProTox Ⅱ server predictions for carvacrol
Classification
|
Target
|
Prediction
|
Probability
|
Organ toxicity
|
Hepatotoxicity
|
Inactive
|
0.75
|
Toxicity end points
|
Carcinogenicity
|
Inactive
|
0.6
|
Toxicity end points
|
Immunotoxicity
|
Inactive
|
0.96
|
Toxicity end points
|
Mutagenicity
|
Inactive
|
0.99
|
Toxicity end points
|
Cytotoxicity
|
Inactive
|
0.89
|
Tox21-Nuclear receptor signaling pathways
|
Aryl hydrocarbon Receptor (AhR)
|
Inactive
|
1
|
Tox21-Nuclear receptor signaling pathways
|
Androgen Receptor (AR)
|
Inactive
|
1
|
Tox21-Nuclear receptor signaling pathways
|
Androgen Receptor Ligand Binding Domain (AR-LBD)
|
Inactive
|
1
|
Tox21-Nuclear receptor signaling pathways
|
Aromatase
|
Inactive
|
1
|
Tox21-Nuclear receptor signaling pathways
|
Estrogen Receptor Alpha (ER)
|
Inactive
|
1
|
Tox21-Nuclear receptor signaling pathways
|
Estrogen Receptor Ligand Binding Domain (ER-LBD)
|
Inactive
|
1
|
Tox21-Nuclear receptor signaling pathways
|
Peroxisome Proliferator Activated Receptor Gamma (PPAR-Gamma)
|
Inactive
|
1
|
Tox21-Stress response pathways
|
Nuclear factor (erythroid-derived 2)-like 2/antioxidant responsive element (nrf2/ARE)
|
Inactive
|
1
|
Tox21-Stress response pathways
|
Heat shock factor response element (HSE)
|
Inactive
|
1
|
Tox21-Stress response pathways
|
Mitochondrial Membrane Potential (MMP)
|
Active
|
1
|
Tox21-Stress response pathways
|
Phosphoprotein (Tumor Supressor) p53
|
Inactive
|
1
|
Tox21-Stress response pathways
|
ATPase family AAA domain-containing protein 5 (ATAD5)
|
Inactive
|
1
|
Pharmacophore characterization of Carvacrol
Pharmacophore characterization helps to understand the nature of the ligand and the bonds that a ligand can form with the receptor. Pharmacophore characterization of carvacrol revealed that it has 3 hydrophobic, 1 hydrogen donor, 1 hydrogen acceptor, and 1 aromatic region Figure 9 and 10. These are the potential ability of the carvacrol to bond with receptor proteins. The pharmacophore illustration of carvacrol with TCF7L2 can be seen in Figure 11.
Binding Sites Determination
DeepSite was performed to predict binding pockets to estimate the binding affinity among receptor-ligand complex. The exact binding positions and the estimated x, y, and z dimensions are shown in Table 4. From the 6 binding pockets, pocket 1 was selected for the binding affinity prediction based on a score of 0.98. Pocket 1 of the TCF7L2 protein 3D structure is shown in Figure 12.
Table 4: Evaluation of binding sites center positions of TCF7L2 protein and their scores estimated by DeepSite.
Site No.
|
Scores
|
Centers (x, y, z)
|
1
|
0.98
|
63.479, -81.389, 60.259
|
2
|
0.95
|
103.480, -129.389, 112.260
|
3
|
0.91
|
57.479, -153.389, 100.260
|
4
|
0.90
|
77.480, -91.389, 130.259
|
5
|
0.95
|
139.479, -155.389, 94.260
|
6
|
0.71
|
137.479, -179.389, 124.260
|
MD docking and Molecular Interactions
The phytochemical screening was performed using Autodock vina to screen out the binding energies of the 7 best phytochemicals based upon ADMET and Pharmacophore characterization. Carvacrol was selected with its lowest binding affinity (-5.5 kcal/mol) compared to other top molecules shown in Table 5. Molecular docking interpreted that Carvacrol was found to have efficient binding with the active site residues of TCF7L2 protein.
Table 5: 2D structure of top, binding molecules to the active sites of TCF7L2 protein with its binding score in (kcal/mol)
Furthermore, interaction analysis was interpreted using Ligplot+, Pymol, and Discovery Studio to visualize the residues, bond distances, their types, and categories. There were one hydrogen and seven hydrophobic interactions. One hydrogen bond interaction was predicted with a distance of 2.20 Å with Tyr137. Two Pi Sigma Bonds are formed with Tyr140, and four alkyl bonds with Phe130, Tyr137, and Tyr140. One Pi-Pi stacked bond was formed with Tyr137. This molecular interaction can be seen in Figure 13 and Table 6.
Table 6: Interpretation of bond length, categories, and types in selected docked complexes through Discovery Studio.
MD Simulation of Dock Complex
The results of the molecular dynamics (MD) simulation reveal the Root Mean Square Deviation (RMSD) of both the protein and the ligand, TCF7L2 and Carvacrol respectively. The RMSD evolution plot illustrates how the RMSD of the protein fluctuates over the course of the simulation, providing insights into its structural conformation Figure 14. In the RMSD graph, the left Y-axis represents the Protein RMSD, while the right Y-axis indicates the Ligand RMSD. The Protein RMSD plot demonstrates the deviation of the protein structure from a reference frame backbone throughout the simulation. It is noted that fluctuations of 1-3 Å are considered acceptable for small, globular proteins, indicating normal thermal fluctuations. Larger deviations suggest significant conformational changes occurring during the simulation. Regarding ligand stability, the Ligand RMSD plot, labeled 'Lig fit Prot', showcases the stability of Carvacrol within the protein's binding pocket. This RMSD is calculated after aligning the protein-ligand complex on the protein backbone and measuring the deviation of the ligand's heavy atoms. If the Ligand RMSD values exceed those of the protein, it suggests potential diffusion of the ligand away from its initial binding site. The RMSD graph indicates stabilization of both the protein and the ligand after approximately 70 nanoseconds into the simulation Figure 14. This stabilization suggests that the interactions between TCF7L2 and Carvacrol have reached equilibrium, allowing for a more rigorous analysis of the molecular dynamics.
The Root Mean Square Fluctuation (RMSF) analysis provides valuable insights into localized changes occurring along the protein chain during the protein-ligand molecular dynamics (MD) simulation. RMSF for each residue is calculated based on the trajectory time, reference time, positions of atoms in the residue, and their average square distance. Peaks in the RMSF plot highlight regions of the protein experiencing the highest fluctuations during the simulation. In the RMSF plot of TCF7L2 protein, comprising 616 amino acids, fluctuations are observed across the entirety of the protein chain. As commonly observed, the terminal regions (N- and C-terminal) exhibit greater fluctuations compared to structured regions such as alpha helices and beta strands, which typically demonstrate more rigidity. This behavior is consistent with protein dynamics expectations. Specifically, in this simulation, the most significant fluctuation was observed at residue 120, with a displacement of approximately 20 angstroms from the reference position. Following this, residues 390 and 410 exhibited notable fluctuations at a distance of approximately 18 angstroms from their respective reference positions Figure 15.
According to the analysis depicted in Figure 15, the ligand, Carvacrol, exhibited a consistent RMSD pattern throughout the simulation, initiating at 5 nanoseconds (ns) and stabilizing at 0.75 Šover the course of the 100 ns runtime. The 'extendedness' of Carvacrol, as represented by the Radius of Gyration (rGyr), remained steady within the range of 45-55 Šuntil the conclusion of the simulation, indicating stable behavior. Intramolecular hydrogen bonds (intraHB) were not observed in 100 ns. The Molecular Surface Area (MolSA), calculated with a 1.4 Šprobe radius, remained constant at 82 Ų throughout the simulation, indicative of a consistent van der Waals surface area. The Solvent Accessible Surface Area (SASA) showed minimal variation, maintaining a near-constant value of 80 Ų after 20 ns until the completion of the 100 ns simulation. Furthermore, the Polar Surface Area (PSA) commenced at 44 Ų and stabilized at 48 Ų for the duration of the simulation, with contributions solely from oxygen and nitrogen atoms. These all properties of ligand can be seen in Figure 16.
The MMGBSA analysis after 100 nanoseconds of simulation for the TCF7L2-carvacrol interaction yielded a binding free energy of -39.81 kcal/mol at the start of the simulation (0 ns) and -31.49 kcal/mol at the end of the simulation (100 ns). The negative values indicate favorable binding between TCF7L2 and carvacrol, suggesting a stable and energetically favorable protein-ligand interaction. This result implies that carvacrol effectively binds to the active site of TCF7L2 throughout the simulation, forming stable complexes with the protein. The decrease in binding free energy from the initial to the final time points could indicate subtle changes in the protein-ligand interaction over the course of the simulation, potentially reflecting adaptations in the binding mode or conformational dynamics within the complex.