3.1. Molecular Docking and Contact Analysis
Among the 265 phytocompounds of OB, chicoric acid (CA), rutin, and salvianolic acid A (SA) were screened as the top-hit compounds. Chicoric acid (-9.136 kcal/mol), rutin (-9.135 kcal/mol), and SA (-11.838 kcal/mol) demonstrated superior docking scores against the E-protein compared to the standard, mycophenolate (-4.481 kcal/mol). Molecular docking is a computational approach utilized to predict the binding affinity between a small molecule and a target protein [30]. Generally, higher docking scores imply stronger binding interactions [38, 39].
The interaction between the ligand and the target protein encompasses the establishment of hydrogen bonds, ionic interactions, salt bridges, and hydrophobic interactions, including pi-cation interactions. These interactions are crucial for the binding affinity and stability of the ligand within the protein's binding site [40]. Consequently, hydrogen bonds and hydrophobic interactions were employed to assess the binding affinity of the identified compounds to the target protein. The majority of the amino acid residues within the binding pocket of the target protein participated in hydrogen bond formation with hit compounds than the standard (Fig. 3). Moreover, at Lys336, CA depicted salt-bridge interaction, whereas rutin had pi-cation interaction on the same amino acid residue (Fig. 3B, C). These interactions indicate the ligands' closer and stronger binding affinity to the target protein [39, 41].
3.2. Molecular Dynamics (MD) Analysis
In a conventional molecular dynamics (MD) simulation, atoms and molecules are permitted to move over a brief period, with the interactions between them determined by force field parameters. These parameters typically depict the temporal evolution of bond lengths, angles, torsions, non-bonding van der Waals interactions, and electrostatic forces among atoms [42]. In this study, MD simulations were evaluated using RMSD, RMSF, protein secondary structure elements (SSE), protein-ligand contacts, the radius of gyration (rGyr), and solvent-accessible surface area (SASA) for each protein-ligand complex to determine their stability [43]. The binding mechanism and stability of the reference compound within the target protein complex served as a benchmark for evaluating the hit molecules. While all the hit compounds exhibited relatively better or comparable stability within the binding pocket of the target protein, only CA and rutin consistently maintained stability across various MD simulation parameters as indicated in the MD analysis results.
3.2.1. RMSD, RMSF, and SSE Analysis
The protein Root Means Square Deviation (RMSD) analysis was performed on protein-ligand complexes to assess the envelope protein backbone's overall stability and conformational changes during the simulation. The RMSD values of the protein's backbone Cα atoms demonstrated greater stability with CA and rutin complexes, as these complexes exhibited relatively lower and less fluctuating RMSD values, which suggests that the binding of the hit compounds to domain III of envelope protein without causing instability in the protein structure (Fig. 4A). The protein complexed with CA displayed an average RMSD value of 3.021 Å, peaking at 12.7 ns (5.511 Å), and then became stable after 20 ns in the entire simulation time. Similarly, the protein complexed with rutin exhibited stable RMSD values between 1.672 and 5.950 Å, averaging 3.774 Å, though some fluctuations were observed after 50 ns. However, the protein complex exhibited more significant fluctuations with SA and mycophenolate throughout the entire simulation period. The average (and range of) RMSD values were 4.411 Å (2.067 to 6.38 Å) for SA and 5.351 Å (1.209 to 7.814 Å) for mycophenolate complex. These results suggest that the backbone Cα atoms of the E-protein achieve greater stability when complexed with CA and rutin than the standard.
Ligand RMSD
The RMSD values of CA, rutin, SA, and mycophenolate displayed diverse stability profiles within the protein's binding pocket, as shown in Fig. 4B. The protein complexed with CA displayed an average RMSD value of 3.892 Å in a range of 1.06 to 8.717 Å, with minor fluctuations observed between 60 and 75 ns. Similarly, the protein complexed with rutin exhibited stable RMSD values ranging between 1.357 and 9.579 Å, averaging 4.413 Å, though some fluctuations were observed after 95 ns. However, the SA, in the binding pocket of the target protein, experienced significant scattering with an average RMSD value of 6.196 Å within a range of 1.12 to 12.307 Å. The average RMSD value of mycophenolate was 5.351 Å within the range of 1.12 to 12.307 Å. Overall, the lower and less fluctuating RMSD values of CA and rutin suggested strong and stable binding of these hits to the protein than SA and the standard.
RMSF and SSE Analysis
The root means square fluctuation (RMSF) of protein residues is instrumental in identifying flexible regions within the protein structure by mapping local variations in amino acid residue movement along the protein sequence [44], while secondary structural elements (SSE) provide insights into the overall folding and stability of the protein [45]. RMSF measures the flexibility of individual residues in the protein, indicating which regions are more dynamic. The E-protein in the complex with CA, rutin, and SA showed less fluctuation and lower peaks than mycophenolate. The average (and range of) ligand RMSF value of CA, rutin, SA, and mycophenolate complex was 1.707 Å (0.577 to 9.608 Å), 1.896 Å (0.626 to 9.92 Å), 1.880 Å (0.725 to 9.682 Å), and 2.173 Å (0.233 to 8.501 Å), respectively. These findings showed that CA and rutin were associated with the protein without causing much changes to the overall conformational of the protein.
The protein-hit compound complexes showed major peaks and fluctuations in naturally flexible regions, such as the loop regions and at the protein's terminal position residues, although the binding was stable at the binding pocket of the protein. In the case of the protein-CA complex, major peaks were observed at loop position 101 (Trp101; RMSF, 3.842 Å), 106 (Gly106; RMSF, 4.173 Å), and 107 (Phe107; RMSF, 4.142 Å); and at the terminal residue 402 (Thr402; RMSF, 7.066 Å), 403 (Leu403; RMSF, 8.501 Å), and 404 (Gly404; RMSF, 9.608 Å). A similar RMSF fluctuation of amino acid residues was noticed for the protein's complex of rutin, although the amplitude of the RMSF value fluctuation was highest for SA and the standard (Fig. 4C). The SSE analysis of the protein’s complex of reported hit compounds similarly demonstrated the overall stability of the protein-ligand complexes. The percentile composition of the α-helices and β-strands was 4.36% and 41.54% for mycophenolate, 4.47% and 41.68% for CA, 4.72% and 41.90% for rutin, and 4.57% and 42.00% for SA, respectively (Supp. Fig.S1). These findings from RMSD, RMSF, and SSE analysis revealed that CA and rutin were more firmly bound than the other two ligands, with minimal changes in the conformation of the target protein [30]. An investigational drug that binds strongly to domain III (DIII) of the viral envelope protein was reported to effectively inhibit virus entry and the subsequent infection of target host cells [46]. The loop regions of the envelope protein, especially at positions ranging from 105 to 107, are crucial for conformational changes of the protein required during fusion, attachment, and the subsequent infection of host cells [8]. The stable binding of the hit compounds at DIII along with increased flexibility of amino acid residues in the loop region due to binding of the hit compounds could block the virus entry into the host cells.
3.2.3. Radius of gyration and Solvent-Accessible Surface Area Analysis
The radius of gyration (rGyr) represents the average distance of atoms from their axis of rotation, calculated as the root mean square distance. It is also the structural parameter that measures the protein’s compactness and extendedness upon ligand binding and is equivalent to the protein’s principal moment of inertia [45]. The MD simulation analysis findings of the rGyr value for the E proteins’ backbone atoms in 100 ns simulation time are shown in Fig. 5A.
The complex of CA, rutin, and SA with E-protein was examined to analyze the influence of the ligand binding on the overall structural compactness of the protein. The average (and range of) rGyr value for CA, rutin, SA, and mycophenolate was 6.459 Å (5.805 to 6.804 Å), 4.600 Å (4.324 to 5.207 Å), 5.082 Å (4.231 to 6.119 Å), and 4.017 Å (3.733 to 4.304 Å) respectively. The rutin complex showed a similar rGyr value to the reference compound, with slight jerks at 22–26 ns, 43–46 ns, and 84–89 ns. However, the CA complex of the protein showed a higher but stable rGyr pattern, while the SA complex experienced lower and less fluctuating rGyr till 46.20 ns before it became more fluctuating with higher rGyr value throughout the simulation time (Fig. 5A). The less fluctuating but higher rGyr values of CA and rutin protein complex suggest the increased extendedness of the protein’s structure to accommodate the bound hit compounds and the ligands' stability on the target protein's binding pocket. This increased extendedness of the protein structure allows for better fitting and stabilization of the ligands within the binding pocket [49].
The solvent-accessible surface area (SASA) of the target proteins complexed with the hit and reference compounds was analyzed to trace changes in the surface area of the proteins accessible to water molecules to evaluate the protein’s unfolding and folding in the presence of ligands [43]. As depicted in Fig. 5B, the target proteins appeared to have varying values of SASA depending on the specific ligand interacting with the target protein. In the case of CA, rutin, SA, and mycophenolate complexed with E-protein, the average SASA value of the protein was 261.358 Å2, 285.239 Å2, 347.225 Å2, and 228.493 Å2, respectively, with noticeable fluctuation at 10ns and 83ns for rutin, from 60 to 70 ns for CA, throughout the simulation time after 43 ns for SA complex. The protein complex showed a similar pattern of SASA fluctuation for the reference compound with that of CA and rutin (Fig. 5B). The relatively lower SASA value of the protein complexed with chicoric aid and rutin indicates that hydrophobic residues of the protein were less exposed to the solvent system, and the more stable and folded protein conformation was maintained during ligand binding [50]. However, the SA complex of the protein showed more fluctuating and higher SASA value than the standard, suggesting less stability of the protein-ligand complex.
3.2.5. Protein-Ligand Contact Analysis
As depicted in Fig. 6, most of the amino acid residues of the E-protein formed water bridge and hydrogen bond interactions with hit compounds over the simulation period. The protein–ligand contact analysis of E-protein when complexed with the reference compound mycophenolate, showed that Asn358 (~ 40%), and Leu354 (> 70%) residues were involved in hydrogen bond formation during the whole simulation period. A hydrophobic interaction was formed by Pro350 and Leu354 residues for about 10% of the total interaction time. Asp37, Thr40, Lys336, and Lal335 residues engaged in a water bridge interaction for more than 50%, 30%, 20%, and 30% of the total running time respectively (Fig. 6A). The protein-CA complex formed hydrogen bond for about 50% of running time at Met34, Lys336, and Asn358 residues, while Met303, Val340, and Val340 residues showed hydrophobic interaction for approximately 25% of analysis running time; Asp37, Lys336, Ile337, Ile339, and Asn358 residues contributed to water bridges interaction for about 50% of the experimentation time with Asp37, Lys336, and Asn358 displaying significant hydrogen-bond interaction as well. Furthermore, the ionic bond was formed at Lys336 for about 10% of the simulation time with CA (Fig. 6B). Leu354 residues of the protein in the rutin–E-protein complex showed hydrogen bond interaction for greater than 50% of total interaction time; Asp37 and Asn358 residues were also noted to form water bridge interaction for at most 75% of simulation time. Val340 and Pro350 residues formed hydrophobic bonds for approximately 25% of the interaction time (Fig. 6C). The protein complex of SA formed hydrogen bonds at Asp37, Lys38, and Leu354; water bridge at Asp37, Lys336, and Asn358; hydrophobic interaction at Leu296 and Val357; and ionic interaction at Lys249 and Lys336 for more than 50%, 10%, 25% and 5% of the simulation time respectively (Fig. 6D). Overall, the majority of the amino acid residues of the protein partook in hydrogen bond and water bridge interactions with the hit compounds, which indicates the stable binding of the ligands to the binding target of the proteins.
3.2.6. Binding Free Energy Calculation
All MD simulation trajectories, generated at 100 ns intervals, were subjected to free binding energy calculations against complexes using the MM/GBSA method [51]. The average binding energy, coulomb energy, covalent binding energy, hydrogen-bonding correction, pi-pi packing correction, lipophilic energy, generalized born electrostatic solvation energy, and van der Waals energy were computed for CA, rutin, SA, and mycophenolate complex of E-protein. A more negative free energy value indicates a stronger binding affinity of the ligand to the target protein [26].
Chicoric acid, rutin, and SA demonstrated lower net binding free energy (ΔGbind) against E-protein than mycophenolate (Table 1). Salvianolic acid A scored the lowest net binding free energy (ΔGbind) than CA and rutin against the target protein. Interestingly, ∆GBind_Coulomb, ∆GBind_Lipo, and ∆GBind_vdW contributed substantially towards ΔGbind of the protein-ligand complexes, whereas ∆GBind_Covalent showed unfavorable energy to the ΔGbind for all the complexes. Although ∆GBind_Solv_GB favored net binding free energy for the complex of mycophenolate with E-protein, it unfavored the net binding free energy across the hit compounds.
Table 1
Binding free energy calculation for E protein-ligand complex using prime/MM-GBSA approach
Energy Contribution (kcal/mol) | Hit phytocompounds |
Chicoric acid | Rutin | Salvianolic acid A | Mycophenolate |
ΔGbind | -57.421 | -50.265 | -61.599 | -39.857 |
ΔGbind_colummb | -15.504 | -38.244 | -23.221 | 30.787 |
ΔGbind_covalent | 6.086 | 6.237 | -0.950 | 0.580 |
ΔGbind_Hbond | -3.977 | -4.749 | -4.464 | -2.062 |
ΔGbind_Lipo | -16.390 | -10.319 | -13.898 | -12.915 |
ΔGbind_Packing | -0.001 | 0.000 | -0.016 | 0.000 |
ΔGbind_Solv_GB | 20.309 | 32.909 | 17.657 | -24.313 |
ΔGbind_vdW | -47.945 | -36.098 | -36.707 | -31.934 |
3.2.7. Principal Component Analysis (PCA)
Principal component analysis (PCA) was done to get insight about the significant motion of the protein owning to ligand binding through eigenfractions derived from a covariance matrix. The first ten eigen modules were taken for PCA, as the majority of functionally relevant dynamics of the proteins are accounted for by the top 10 to 20 eigen models [52]. Consequently, the target protein complexes with CA, rutin, SA, and mycophenolate showed simulated motions between 52.664% and 65.786% from the first two eigenvectors. The first two principal components (PC1: black cluster, and PC2: red cluster) were selected for the projection of the major dynamics of the protein. In a 2D plot of PCA (Fig. 7), the complex occupying minimal phase space with a condensed conformational distribution is more stable, while a complex that takes a wider space and has a dispersed distribution indicates a noteworthy alteration in the protein structure and less stability of the protein-ligand complex [53]. PCA helps to identify significant conformational alteration in the protein structure, which could lead to loss of the protein function, as the result of ligand binding [34].
The complex system involving the E-protein and mycophenolate exhibited the highest level of correlated motion, with percentages of 43.679% for PC1 and 22.107% for PC2. Following this, the apo E-protein (PC1: 39.614%; PC2: 23.498%), SA (PC1: 37.92%; PC2: 14.779%), CA (PC1: 35.547%; PC2: 22.896%), and rutin (PC1: 34.943%; PC2: 17.702%) complexes of the protein demonstrated progressively lower levels of motion. These findings indicated that the hit compounds' binding stabilized the E-protein compared to the standard. These high PCA values could be linked to the increased flexibility of the amino acid residues in the binding site, as was noticed from the RMSF values of the mycophenolate-protein complex. The E-protein's CA and rutin complexes occupied relatively less phase space with a condensed cluster because of their stable complex systems (Fig. 7C-E). These stable complexes of CA and rutin suggest that they could inhibit the virus attachment to host cells thereby blocking subsequent virus entry into the host cells.
3.2.8. Free Energy Landscape Analysis
The Free Energy Landscape (FEL) helps to pinpoint the structure of the ligand-protein complex with the lowest energy (0 Kcal/mol) from all potential conformations generated during MD simulation [54]. The 2D free energy profile in the left panel of Fig. 8, displaying the lowest energy coordinates of PC1 and PC2, was utilized to extract the frame corresponding to the energy minima. Subsequently, the docked complex was superimposed onto the structure indicative of the lowest free energy minima, allowing for the examination of changes in the ligands' binding poses, as shown in the right panel of Fig. 8. In the left panel, the energy states of the structural conformations are depicted, with the black color representing the lowest stable state. The right panel illustrates the ligand's lowest energy conformation post-MD simulation in green, while the blue color indicates the conformations of the docked ligand before the MD simulation.
The RMSD values for local minima structure of the complex of E-protein with mycophenolate (PC1: 17.562; PC2: -10.774), CA (PC1: 12.78; PC2: 7.764), rutin (PC1: -30.454; PC2: -9.945), SA (PC1: -10.989; PC2: -5.769), and superimposed with the corresponding docked pose, were 0.9523 Å, 1.5950 Å, 2.0368 Å, and 2.6439 Å, respectively, (Fig. 8A-D). Moreover, CA, rutin, and SA formed one hydrogen bond interaction at the most stable conformation via Asn358, Asn358, and Lys38, respectively. In addition to salt-bridge interaction at Lys360 with CA and at Lys336 with rutin, the E-protein formed pi-cation interaction with the later hit compound at Lys336 (Supp. Fig. S2). These observations suggest that CA and rutin formed a more stable complex with E-protein, which could result in the inhibition of virus attachment and entry into host cells.