Data mining using artificial intelligence and molecular dynamics analysis to detect HIV-1 reverse transcriptase RNase H activity inhibitor

doi:10.21203/rs.3.rs-3000807/v1

Research Article

https://doi.org/10.21203/rs.3.rs-3000807/v1

This work is licensed under a CC BY 4.0 License

Journal Publication

published 10 Aug, 2023

Read the published version in Molecular Diversity →

You are reading this latest preprint version

In this study, we developed a process to identify an HIV-1 protein target and a new drug candidate. Genomic analysis was conducted on HIV-1 genomes to identify a viable target for disrupting viral replication, which pointed to the reverse transcriptase enzyme. Based on MAUVE analysis, we selected the RNase H activity of the reverse transcriptase as the potential target due to its low mutation rate and high conservation. We performed virtual screening of about 94,000 small molecules against this target. Molecular dynamics simulations and MM/PBSA were used to validate the stability and binding free energy of the hit compounds. Phomoarcherin B, known for its anticancer properties, emerged as the top candidate, showing potential as an inhibitor of HIV-1 reverse transcriptase RNase H activity.

computational biology

drug discovery

HIV-1

molecular dynamics

molecular docking

pol gene

reverse-transcriptase RNase H

Human immunodeficiency virus 1 (HIV-1) has remained a global public health issue since its first identification in 1981. HIV-1 infection results in acquired immunodeficiency syndrome (AIDS), which leaves the host's immune system defenceless against secondary infections [1]. The World Health Organization refers to HIV-1 as a "global epidemic" [2]. According to the Joint United Nations Programme on HIV/AIDS (UNAIDS) global statistics, around 33.6–48.6 million individuals had died of AIDS-related illnesses from its emergence up to 2021, and 33.9–43.8 million individuals still live with the virus, most of them in sub-Saharan Africa [3, 4].

HIV is a complex retrovirus; like other retroviruses, it stores its genome as a pair of ssRNA molecules of ~9 kb. The genome contains 1) the gag gene, which encodes the structural proteins, mainly the capsid protein, the matrix protein, and the nucleocapsid; 2) the pol gene, which encodes the reverse transcriptase enzyme (R.T.), the protease enzyme, and the integrase enzyme (IN); and 3) the env gene, which encodes the membrane glycoprotein 120 and glycoprotein 41. Additionally, the HIV genome encodes six regulatory proteins (Tat, Rev, Nef, Vif, Vpr, and Vpu) that are responsible for its pathogenicity and replication in the host [5]. Once the virus enters the host, the capsid disintegrates, and the viral RNAs are reverse-transcribed into DNA molecules via R.T., a process that starts with an RNA/DNA hybrid and is followed by cleavage of the RNA and synthesis of dsDNA. The proviral dsDNA is then integrated into the host genome via the integrase enzyme and remains a reservoir for the virus until the cell is activated, upon which the cellular transcription and translation mechanisms are hijacked to produce the viral proteins and RNA genomes [6–8]. Nucleoside/nucleotide reverse transcriptase inhibitors (NRTIs), non-nucleoside reverse transcriptase inhibitors (NNRTIs), protease inhibitors (P.I.s), entry or fusion inhibitors, and integrase strand transfer inhibitors (INSTIs) are the major classes of antiretroviral therapy drugs, targeting five different phases of the HIV life cycle. Currently, 17 registered drugs are available for treatment against HIV [9].

Managing AIDS requires lifelong drug administration. Antiretroviral therapy has turned a fatal disease into a chronic one, but patients must take the drugs permanently, potentially for life, to suppress HIV replication, and therapy fails when this strict lifelong adherence is not maintained. Although the drugs are commonly well tolerated, they have some short-term toxic effects and may have known and unknown long-term toxic effects. This situation is associated with persistent immune dysfunction and carries the risk of various non-AIDS-related complications, such as heart, bone, liver, kidney, and neurocognitive diseases [9]. Therefore, developing new drugs that address these limitations is essential.

Computational drug discovery methods have gained considerable momentum recently, especially with the availability of vast computational resources at lower cost. Virtual screening has previously been used to identify lead compounds potentially inhibiting several HIV-1 proteins and enzymes, including the RNase domain. However, such studies were based solely on molecular docking, pharmacophore, and ADMET assays, which mainly consider the best docking pose a ligand can adopt against the target protein in a vacuum-like condition and may not reflect the interaction under physiological cell conditions [10–12]. An extensive computational analysis with molecular dynamics simulations by Zhang et al. [13] yielded potential hits against the HIV-1 RT RNase domain. However, that work was limited to 77 α-hydroxytropolone derivatives, which restricted the discovery of novel small molecule inhibitors. Hence, no large-scale computational analysis combining all-atom molecular dynamics simulations with some form of binding free energy calculation to validate the potential lead compounds has been performed against the HIV-1 RT RNase domain.

In previous studies, MAUVE analysis was conducted to find the stable region of the SARS-CoV-2 genome and to discover potential drug candidates against proteins playing a role in its virulence [14–16]. In this study, a similar approach was applied to establish the rationale for targeting the R.T. enzyme in anti-HIV-1 drug discovery and development. After the R.T. enzyme was established as the best candidate for drug targeting, a dataset of around 94,000 small drug-like molecules was obtained from the ZINC15 database, and structure-based virtual screening against the R.T. RNase domain was performed via molecular docking. The interactions and dynamics of the top lead molecules were further validated using molecular dynamics. The general workflow of the study is illustrated in Fig. 1.

2.1 Phylogenetic tree

A total of 190 complete HIV-1 genomic sequences were retrieved from the Los Alamos HIV sequence database [17]. The sequences were manually selected such that only two sequences (where applicable) were chosen from each country, and the dataset included all the geographic regions available. The whole-genome sequences were converted to FASTA format and aligned using ClustalW, and a phylogenetic tree was constructed [18]. The tree was visualized using Ugene software v. 38 [19]. This analysis was carried out to estimate gene pool variation and to elucidate common genomic regions across the subtype diversity associated with the origin of the selected genomes.
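The per-country cap described above was applied manually in the study. As a rough illustration only, the same rule could be sketched in code as follows; the record layout, the sequence identifiers, and the function name `pick_per_country` are hypothetical and do not come from the Los Alamos database or the study's dataset.

```python
from collections import defaultdict


def pick_per_country(records, max_per_country=2):
    """Keep at most `max_per_country` records per country, preserving input order."""
    kept = []
    counts = defaultdict(int)
    for rec in records:
        country = rec["country"]
        if counts[country] < max_per_country:
            kept.append(rec)
            counts[country] += 1
    return kept


# Hypothetical metadata for illustration; not the sequences used in the study.
records = [
    {"id": "seq1", "country": "US"},
    {"id": "seq2", "country": "US"},
    {"id": "seq3", "country": "US"},
    {"id": "seq4", "country": "IN"},
]
selected = pick_per_country(records)
selected_ids = [rec["id"] for rec in selected]
print(selected_ids)
```

A cap like this limits, but does not remove, the regional bias that arises when some countries deposit far more high-quality sequences than others.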

2.2 Whole-genome alignment and BLASTx

The dataset carried some bias towards certain regions. For instance, the U.S. received higher coverage owing to its size and the number of high-quality sequences deposited, whereas other countries such as India, despite its population size, received lower coverage because of the limited number of high-quality sequences deposited. Whole-genome alignment was performed via the progressive MAUVE algorithm with the match seed weight set to automatic calculation and the minimum Locally Collinear Blocks (LCB) set to default (3 times the minimum match size); progressive MUSCLE (v3.6) was selected as the gap aligner for each LCB, and the minimum island size, maximum backbone gap size, and minimum backbone size were set to 50 [20, 21]. The list of all sequences used for the alignment is included in Supplementary Data 1, and the whole-genome alignment result is included in Supplementary Data 2. The alignment result was visualized in Geneious Prime (v2020.1), and the most highly conserved continuous region with no gaps was excised from the alignment and visualized separately in depth [22]. A consensus identity sequence was generated from the conserved fragment using Jalview and submitted to NCBI BLASTx with the default parameters (max target sequences 100, expected threshold 0.05, word size 6, max matches in a query range 0, matrix BLOSUM62, gap costs of 11 for existence and 1 for extension, and compositional adjustments via conditional compositional score matrix adjustment). The alignment for the excised fragments is provided in Supplementary Data 3, and the consensus sequence is provided in Supplementary Data 4 [23–25].

Fig. 1 (caption fragment): … that were analysed, and sky-blue shapes indicate the computational steps.
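To make the idea of excising the longest gap-free conserved region and deriving a consensus concrete, here is a minimal sketch on a toy alignment. It is not the Geneious Prime or Jalview procedure used in the study; the 0.9 identity threshold, the function names, and the toy sequences are assumptions chosen for illustration.

```python
from collections import Counter


def column_identity(column):
    """Fraction of sequences sharing the most common residue; 0.0 if the column has a gap."""
    if "-" in column:
        return 0.0
    return Counter(column).most_common(1)[0][1] / len(column)


def longest_conserved_run(aligned, min_identity=0.9):
    """Return (start, end) of the longest run of conserved, gap-free columns (end exclusive)."""
    length = len(aligned[0])
    best = (0, 0)
    run_start = None
    for i in range(length + 1):
        conserved = i < length and column_identity([s[i] for s in aligned]) >= min_identity
        if conserved and run_start is None:
            run_start = i
        elif not conserved and run_start is not None:
            if i - run_start > best[1] - best[0]:
                best = (run_start, i)
            run_start = None
    return best


def consensus(aligned, start, end):
    """Majority residue at each column in [start, end)."""
    return "".join(
        Counter(s[i] for s in aligned).most_common(1)[0][0] for i in range(start, end)
    )


# Toy alignment (hypothetical); real input would be the MAUVE whole-genome alignment.
aligned = ["AC-GTTACGA", "ACGGTTACGT", "A-GGTTACGA"]
region = longest_conserved_run(aligned)
fragment = consensus(aligned, *region)
print(region, fragment)
```

In the study the consensus of the excised fragment was then submitted to BLASTx; that step is not reproduced here.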

2.3 Molecular docking-based virtual screening

The experimental X-ray diffraction structure of HIV-1 RT with Protein Data Bank (PDB) I.D. 3IG1 was retrieved from the Research Collaboratory for Structural Bioinformatics (RCSB) PDB website [26, 27]. The missing residues were added to the structure via PyMol's builder plugin (open source v2.5.0), and the loop regions where residues were added were refined using MODELLER (v10.1) [28–30]. The structure was then cleaned of all heteroatoms except for the cofactor atoms, polar hydrogens were added where necessary, and Kollman charges were computed [31]. A library of 94,545 annotated anodyne small molecules (ligands) that are stable at physiological pH and have a charge of 0, -1, or -2 was generated from the ZINC15 database [32]. A grid box of 25 Å × 32 Å × 32 Å along the X, Y, and Z axes was calculated around the RNase H active site. Virtual screening of the ligand dataset against the HIV-1 RT structure was performed within this grid box at an exhaustiveness of 64 via AutoDock Vina (v1.1.2) [33]. The top 7 molecules with the highest affinity scores were screened again with the same configuration but with an exhaustiveness of 256, and compounds that reproduced their scores in the same pose were retained for further analysis.
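The two screening passes differ only in exhaustiveness, which can be sketched as two AutoDock Vina configuration files generated from one helper. The box size and exhaustiveness values come from the text above; the file names and the box centre are placeholders, since the centre coordinates are not reported here.

```python
def vina_config(receptor, ligand, center, size, exhaustiveness, out):
    """Build the text of an AutoDock Vina configuration file."""
    cx, cy, cz = center
    sx, sy, sz = size
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        f"exhaustiveness = {exhaustiveness}",
        f"out = {out}",
    ]
    return "\n".join(lines) + "\n"


BOX_SIZE = (25, 32, 32)          # Å, as stated in the text
BOX_CENTER = (0.0, 0.0, 0.0)     # placeholder: the actual centre is not given

# First pass over the library, then re-screening of the top hits.
screen_cfg = vina_config("3ig1_prepared.pdbqt", "ligand.pdbqt",
                         BOX_CENTER, BOX_SIZE, 64, "ligand_out.pdbqt")
rescreen_cfg = vina_config("3ig1_prepared.pdbqt", "ligand.pdbqt",
                           BOX_CENTER, BOX_SIZE, 256, "ligand_rescreen.pdbqt")
print(screen_cfg)
```

Each generated file would be passed to the `vina` executable through its `--config` option, once per ligand.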

2.4 Protein-ligand interaction profiling

The best docking pose of each top hit ligand was loaded together with the HIV-1 RT into PyMol, and all residues within 4 Å of the lead compounds were visualized and evaluated for potential hydrophobic interactions, hydrogen bonds, and ionic interactions. The manually predicted interactions were cross-validated with TU Dresden's Protein-Ligand Interaction Profiler (PLIP) webserver, and only overlapping interactions were considered [34].
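The 4 Å neighbourhood selection can be illustrated with a plain distance check. This is a sketch of the geometric criterion only, not of PyMol's selection language or PLIP's interaction typing; the coordinates and the helper name `residues_within` are hypothetical.

```python
import math


def residues_within(ligand_atoms, protein_atoms, cutoff=4.0):
    """Return sorted residue labels with any atom within `cutoff` Å of any ligand atom."""
    close = set()
    for label, coord in protein_atoms:
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            close.add(label)
    return sorted(close)


# Hypothetical coordinates (Å) for illustration; not taken from PDB 3IG1.
ligand_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
protein_atoms = [
    ("Q500", (3.0, 0.0, 0.0)),  # 1.5 Å from the second ligand atom
    ("Y501", (0.0, 3.9, 0.0)),  # 3.9 Å from the first ligand atom
    ("H539", (0.0, 0.0, 9.0)),  # well outside the cutoff
]
contacts = residues_within(ligand_atoms, protein_atoms)
print(contacts)
```

A distance cutoff only nominates candidate residues; classifying a contact as hydrophobic, hydrogen bond, or ionic additionally depends on atom types and geometry.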

2.5 Molecular dynamics simulation

Docking studies do not take the protein's flexibility into account. To confirm the binding mode and check its stability, molecular dynamics (M.D.) should be evaluated. M.D. simulations are computational techniques that allow the dynamic behavior of molecular systems to be studied over time. In this study, we used the academic version of the Maestro program for the M.D. simulations [35], which let us examine the interactions, conformational changes, and overall behavior of the protein-ligand complexes. The protein of interest is PDB ID 3IG1, and we used four natural compounds (Table 1) as ligands. We aimed to determine the effect of the ligands on the stability of the protein and to identify which regions of the protein fluctuate the most. The simulations were run in an NPT ensemble at a temperature of 300 K for a simulation time of 100.102 ns. The NPT ensemble keeps the number of particles, the pressure, and the temperature constant while allowing the volume to fluctuate. This ensemble is useful for examining the behavior of systems in solution and provides a more realistic simulation environment. We monitored the RMSD of both protein and ligand throughout the simulation.

We also calculated the RMSF of the protein to identify local changes along the protein chain. The RMSF values for each residue were monitored throughout the simulation, and the peaks identified the protein regions that fluctuated the most. In addition, we analyzed ligand contacts throughout the simulation to identify the protein residues that interact with the ligand; these residues are marked with green vertical bars, which shows which residues were involved in binding the ligand. We also monitored the secondary structure of the protein throughout the simulation. Secondary structure refers to local conformation patterns such as alpha helices and beta sheets; it can play an essential role in a protein's function, and changes in it can affect the protein's activity. Finally, we reported the presence of counterions and the salt concentration in the solvent medium. Counterions are ions added to a system to maintain charge neutrality, and their presence can affect the system's behavior. In summary, the M.D. protocol covered the RMSD and RMSF of the protein and ligand, the ligand-interacting residues, the secondary structure of the protein, and the counterions and salt concentration in the solvent medium.
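For readers unfamiliar with the two metrics, the following sketch computes RMSD against a reference frame and per-atom RMSF on a toy trajectory. It assumes the frames are already superimposed on the reference (no fitting is done here) and is not the analysis code of the simulation package used in the study; the coordinates are invented.

```python
import math


def rmsd(frame, reference):
    """Root-mean-square deviation between two frames (lists of (x, y, z) tuples)."""
    sq = [sum((a - b) ** 2 for a, b in zip(p, q)) for p, q in zip(frame, reference)]
    return math.sqrt(sum(sq) / len(sq))


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation around each atom's mean position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in trajectory) / n_frames for k in range(3)]
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy trajectory: atom 0 is rigid, atom 1 oscillates along x (hypothetical data, in Å).
trajectory = [
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
]
rmsd_last = rmsd(trajectory[-1], trajectory[0])
fluctuations = rmsf(trajectory)
print(round(rmsd_last, 3), [round(v, 3) for v in fluctuations])
```

RMSD summarizes how far a whole frame has drifted from the reference, whereas RMSF is resolved per atom or residue, which is why peaks in RMSF point to the most mobile regions.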

2.6 Binding free energy calculation via MM/PBSA

The binding free energy (∆Gbind, Gibbs free energy) between the lead compounds and the R.T. enzyme was calculated from the last 10 ns (stable RMSD interval) of each simulation system, comprising 1001 snapshots, via the MM/PBSA single trajectory protocol. The formula in Eq. (2) was followed to calculate the energy terms for each of the R.T., the lead compound, and the RT-lead complex using the CaFE plugin (v1.0), and Eq. (4) was finally used to calculate the ∆Gbind [36–38]. The equation derivation and approximations are provided in Supplementary Data 5.
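For orientation, the commonly used single-trajectory MM/PBSA decomposition can be written as below. This is the generic textbook form, stated here as an assumption; it is not copied from Supplementary Data 5, and its numbering need not correspond to Eq. (2) or Eq. (4). The entropy term is omitted, in line with the energy terms listed in Table 3.

\[
G_{x} = E_{elec} + E_{vdw} + G_{PB} + G_{SA}, \qquad x \in \{complex,\ protein,\ ligand\}
\]

\[
\varDelta G_{bind} = \left\langle G_{complex} - \left(G_{protein} + G_{ligand}\right) \right\rangle
\]

Here the angle brackets denote an average over the snapshots, and in the single trajectory protocol all three species are taken from the same complex trajectory, so the bonded internal energies cancel in the difference.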

2.7 Analysis of calculated physicochemical and ADMET properties

The top lead compounds were submitted to the SwissADME webserver of the Swiss Institute of Bioinformatics to calculate their physicochemical properties and drug-likeness [39]. The drug-likeness of the lead compounds was evaluated based on 5 filters: the Lipinski rule of 5, Ghose filters, Veber filter, Egan filter, and Muegge filter [40–44]. For the ADMET analysis, the admetSAR (which is also used by DrugBank to evaluate drugs) and ADMETlab (v2.0) webservers were used together to analyse each lead compound [45, 46].

2.8 Additional docking studies

To evaluate the anti-RNase H activity of Phomoarcherin B, the reverse transcriptase enzyme of Feline Immunodeficiency Virus (FIV), which also exhibits a similar DEDD motif (PDB: 5OVN) [47], was investigated using the same docking approach. Furthermore, docking of Phomoarcherin B with RNase H of Bacteriophage T4 (PDB: 1TFR) [48] and monomeric reverse transcriptase of Moloney murine leukemia virus (MLV, PDB: 4MH8) [49] was also investigated.

Also, to understand the binding affinity of Phomoarcherin B to a critical target structure, the transmembrane domain of HIV-1 gp41 (PDB: 5JYN) [50] was used as a receptor. The energy of Phomoarcherin B was minimized with the steepest descent algorithm in Avogadro software using the MMFF94 force field [51]. As described above, the missing residues were added to the structure, which was then cleaned of all heteroatoms except for the cofactor atoms; polar hydrogens were added where necessary, and Kollman charges were computed [31]. To predict the affinity of Phomoarcherin B to the transmembrane domain of HIV-1 gp41, blind docking was performed using a grid box in which the receptor coordinates fit within the following volume: X = 78.0, -17.2, -8.8; Y = 128.0, 19.3, 26.2; Z = 103.0, 1.1, 8.7. The ligand coordinates were X = 76.0, -28.2, -12.5; Y = 131.4, 21.5, 27.6; Z = 103.7, -3.4, 7.5, and the grid box X, Y, and Z dimensions were 55.4 Å, 49.7 Å, and 40.1 Å, respectively. Molecular docking studies were conducted according to these values.

3.1 Phylogenetic tree

The phylogenetic tree of the HIV-1 sequences is given in Supplementary Data 6. The tree showed two main clusters containing sub-clusters that vary with the origins of the subtypes. Subtypes with the same origin are placed in the same sub-cluster and branch according to the gene cluster. These data showed that HIV-1 variants are genetically diverse according to their geographical dissemination. This result provided a broad, diverse genomic basis on which to search for a common region in the HIV-1 genome.

3.2 Whole-genome alignment and BLASTx

The results from the whole-genome alignment are provided in Fig. 2. The alignment is visualized such that each LCB is clustered together, and it is zoomed out so that each clustered LCB is shown as a continuous black bar. A region of around 2.4 kb, within the ≈ 2.8–5.3 kb range of the consensus sequence, is highly conserved with almost no gaps in most of the sequences. In addition, the progressive MAUVE algorithm enabled us to observe the longest highly conserved common regions within the HIV-1 genomes, aligned with the same gap region in the sequences (Supplementary Fig. 1; adenine in pink, guanine in yellow, thymine in green, and cytosine in blue).

The BLASTx results for the consensus sequence indicated that the highly conserved region belongs to the HIV-1 pol gene (NCBI GenBank: QMX87928.1), nominating the functional proteins of the HIV-1 pol gene (R.T., IN, and late-phase protease) as promising targets for developing therapeutics. Therefore, further therapeutic screening and analysis were performed on one of the main pol gene products, the reverse transcriptase enzyme [52].

3.3 Molecular docking-based virtual screening

The virtual screening of the ligand dataset against the HIV-1 RT enzyme nominated 7 compounds with high affinities (-8.5 kcal/mol or lower). Among them, only 4 compounds achieved the same affinity in 3 subsequent runs, so only these 4 compounds were selected and further analysed: (1) 4-hydroxy-3-[5-[5-(2-hydroxyphenyl)-1H-pyrazol-4-yl]-4,5-dihydro-1H-pyrazol-3-yl]chromen-2-one (C21H16N4O4), (2) Artoindonesianin P (C20H16O7), (3) 12a-Hydroxydolineone (C19H12O7), and (4) Phomoarcherin B (C23H28O5). The top docking poses for these 4 compounds are shown in Fig. 3A, and a summary of each compound along with its chemical structure is given in Table 1.
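The retention rule (keep a hit only if repeated runs reproduce its affinity) can be sketched as a simple filter. The scores, compound labels, and the 0.1 kcal/mol tolerance below are hypothetical, and the pose comparison that the study also required is left out of this sketch.

```python
def reproducible_hits(runs, tolerance=0.1):
    """Return names whose repeated docking affinities (kcal/mol) agree within `tolerance`."""
    return [
        name
        for name, scores in runs.items()
        if max(scores) - min(scores) <= tolerance
    ]


# Hypothetical repeated-run affinities; not the values obtained in the study.
runs = {
    "hit_A": [-8.5, -8.5, -8.5],
    "hit_B": [-8.9, -8.2, -8.6],
    "hit_C": [-8.7, -8.7, -8.7],
}
retained = reproducible_hits(runs)
print(retained)
```

Filtering on reproducibility guards against hits whose score depends on the stochastic search in a single docking run.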

Table 2

Summary of all interactions between the RNase H active site and the respective lead compounds, as visualized in Fig. 3B.
Interacting compound	Interacting residue	Distance (Å)	Interaction type
Compound 1	A445	3.80	Hydrophobic
	Q500	3.69	Hydrophobic
	Y501	3.72	Hydrophobic
	S499	3.59	Hydrogen bond
	Q500	2.58	Hydrogen bond
	Y501	2.28	Hydrogen bond
	Mn²⁺	2.40	Ionic
	Mn²⁺	2.60	Ionic
Compound 2	Q475	3.62	Hydrophobic
	R448	2.64	Hydrogen bond
	Q500	2.47	Hydrogen bond
	Y501	2.03	Hydrogen bond
	Mn²⁺	2.50, 3.30*	Ionic
	Mn²⁺	2.50	Ionic
Compound 3	Q500	3.23	Hydrophobic
	H539	3.22	Hydrophobic
	Mn²⁺	2.21, 2.64*	Ionic
	Mn²⁺	2.73, 2.91*	Ionic
Compound 4	A445	3.85	Hydrophobic
	Q500	3.96	Hydrophobic
	A538	3.66	Hydrophobic
	Q500	2.50	Hydrogen bond
	Y501	2.25	Hydrogen bond
	Mn²⁺	2.41	Ionic
	Mn²⁺	2.60	Ionic
* Lead compounds making more than one interaction with the same Mn²⁺ cation have their distances mentioned within the same cell separated by a comma.

3.4 Protein-ligand interaction profiling

The interactions between the R.T.'s RNase H catalytic site, with its divalent cations, and the top 4 potential lead compounds (as shown in Fig. 3) were closely analysed and visualized. Fig. 3B (a-d) shows all the potential hydrophobic interactions (yellow dashes) and hydrogen bonds (magenta dashes) between each residue and lead compound within the RNase H active site; the interacting residues from the R.T. backbone are further expanded (stick representations in dark wild willow) to visualize the interacting atoms. The right columns in Fig. 3B (e-h) show all the potential interactions between the lead compounds and the cofactor Mn²⁺ cations (cyan beads, dark blue dashes); the residues D443, E478, D498, and D549 (DEDD motif) interacting with the cofactor cations are also expanded (sky blue sticks) to visualize the proximity of the interactions. All interactions of the lead compounds within the RNase H active site are listed in Table 2 along with their distances.

3.5 Molecular dynamics simulation

The stability of the interaction between the four compounds and the HIV-1 reverse transcriptase (PDB 3IG1, solved with the inhibitor beta-thujaplicinol bound at the RNase H active site) was systematically investigated with 100 ns molecular dynamics simulations (Fig. 4). The RMSD values allowed us to determine whether the simulation had stabilized and whether there were any significant conformational changes. The RMSD values of the protein and ligand stabilized after about 20 ns of simulation, indicating that the system had reached equilibrium. Some residues had higher RMSF values than others, suggesting that certain protein regions are more flexible; these highly flexible regions could play a role in ligand binding to the protein. Tracing the ligand contacts showed that the ligands interacted with different residues in the protein and that the specific interactions varied depending on the ligand. The secondary structure of the protein remained constant throughout the simulation, without any significant change in the distribution of the secondary structural elements. Finally, counterions were present in the solvent medium, and their concentrations remained relatively constant throughout the simulation. As shown in Fig. 5, during the molecular dynamics simulation of the four complexes, the reverse transcriptase structure showed a stable trend after 20 ns, and for the compound 4-protein complex the fluctuation of the RMSD value was significantly higher than for the other three compounds. As seen in Fig. 6, the higher RMSF values for compound 4 mean that the ligand moves more in that region, which suggests that compound 4 interacted more with the protein.

3.6 Binding free energy calculation via MM/PBSA

Using the single trajectory approach for the MM/PBSA calculation, the 8 energy terms (\({\varDelta E}_{elec}\), \({\varDelta E}_{vdw}\), \({\varDelta G}_{PB}\), \({\varDelta G}_{SA}\), \({\varDelta G}_{gas}\), \({\varDelta G}_{sol}\), \({\varDelta G}_{pol}\), \({\varDelta G}_{npol}\)) were calculated separately for the R.T. enzyme, the lead compound, and the RT-lead complex from each production simulation, and their total sum was used in Eq. (4) to calculate the binding free energy \({\varDelta G}_{bind/mmpbsa}\). The sums of each energy term are provided in Table 3 along with their standard deviations. The values of \({\varDelta G}_{mmpbsa}\) indicate the spontaneity of the interaction between the R.T. and the lead compounds (i.e. more negative = more spontaneous). Detailed values of each energy term for the protein, ligand, and complex are provided separately in Supplementary Data 7.

Table 3

MM/PBSA energy terms for \({\varDelta G}_{complex}\) calculated for each RT-lead pair of HIV-1. Energies were calculated from the last 10 ns of the production trajectory using the single trajectory approach.
\({\varDelta E}_{elec}\)	\({\varDelta E}_{vdw}\)	\({\varDelta G}_{PB}\)	\({\varDelta G}_{SA}\)	\({\varDelta G}_{Gas}\)	\({\varDelta G}_{sol}\)	\({\varDelta G}_{pol}\)	\({\varDelta G}_{npol}\)	\({\varDelta G}_{mmpbsa}\)
				Compound1
-28.73 ± 3.76	-49.67 ± 3.40	41.65 ± 2.85	-5.10 ± 0.12	-78.40 ± 5.07	36.56 ± 2.81	12.92 ± 4.48	-54.77 ± 3.44	-41.84 ± 3.90
				Compound2
-3.68 ± 3.15	-50.89 ± 2.67	30.76 ± 3.84	-5.38 ± 0.10	-54.58 ± 3.97	25.38 ± 3.83	27.08 ± 4.11	-56.28 ± 2.64	-29.20 ± 4.80
				Compound3
-2.27 ± 1.23	-48.92 ± 2.66	13.21 ± 1.20	-4.90 ± 0.11	-51.20 ± 3.00	8.32 ± 1.18	10.94 ± 1.52	-53.818 ± 2.70	-42.88 ± 3.21
				Compound4
-2.83 ± 1.95	-54.86 ± 2.56	27.58 ± 3.64	-5.48 ± 0.09	-57.69 ± 3.02	22.11 ± 3.62	24.76 ± 4.22	-60.33 ± 2.54	-35.58 ± 4.84
± indicates the standard deviations. All energy values are given as the difference between the complex and the sum of protein and ligand, \({\varDelta X}_{y}= {\varDelta X}_{y\left(complex\right)}-({\varDelta X}_{y\left(protein\right)}+{\varDelta X}_{y\left(ligand\right)})\). All values are in kcal/mol.
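The columns of Table 3 are not independent, and a short check makes the relationships explicit. The relations used below (gas = elec + vdw, sol = PB + SA, pol = elec + PB, npol = vdw + SA, total = gas + sol) are inferred from the tabulated numbers and from common MM/PBSA usage; they are an assumption rather than a quotation of the supplementary equations. The values are the means from Table 3, and the 0.02 kcal/mol tolerance allows for rounding.

```python
# Mean values from Table 3 (kcal/mol); standard deviations are not used here.
TABLE3 = {
    "Compound 1": dict(elec=-28.73, vdw=-49.67, pb=41.65, sa=-5.10,
                       gas=-78.40, sol=36.56, pol=12.92, npol=-54.77, total=-41.84),
    "Compound 2": dict(elec=-3.68, vdw=-50.89, pb=30.76, sa=-5.38,
                       gas=-54.58, sol=25.38, pol=27.08, npol=-56.28, total=-29.20),
    "Compound 3": dict(elec=-2.27, vdw=-48.92, pb=13.21, sa=-4.90,
                       gas=-51.20, sol=8.32, pol=10.94, npol=-53.818, total=-42.88),
    "Compound 4": dict(elec=-2.83, vdw=-54.86, pb=27.58, sa=-5.48,
                       gas=-57.69, sol=22.11, pol=24.76, npol=-60.33, total=-35.58),
}


def recompute(t):
    """Derive the composite terms from the four primary terms (assumed relations)."""
    gas = t["elec"] + t["vdw"]
    sol = t["pb"] + t["sa"]
    return {
        "gas": gas,
        "sol": sol,
        "pol": t["elec"] + t["pb"],
        "npol": t["vdw"] + t["sa"],
        "total": gas + sol,
    }


def max_deviation(table):
    """Largest absolute gap between tabulated and recomputed composite terms."""
    return max(
        abs(row[key] - value)
        for row in table.values()
        for key, value in recompute(row).items()
    )


deviation = max_deviation(TABLE3)
ranking = sorted(TABLE3, key=lambda name: TABLE3[name]["total"])
print(round(deviation, 3), ranking)
```

Sorting by the total puts the most negative (most favourable) binding free energy first, which is the ordering implied by the "more negative = more spontaneous" reading of the table.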

3.7 Analysis of calculated physicochemical and ADMET properties

The physicochemical properties of the 4 lead compounds are listed in Table 4 along with their drug-likeness results. All 4 lead compounds passed the Lipinski rule of 5, Ghose filters, Veber filter, Egan filter, and Muegge filter without any violations. The ADMET profiles of each lead compound are summarized in Table 5 based on the results from admetSAR and ADMETlab.

Table 4

Physicochemical and drug-likeness properties of the lead compounds based on the SwissADME results.
Physicochemical properties	Compound 1	Compound 2	Compound 3	Compound 4
Molecular weight (g/mol)	388.38	368.34	352.29	384.47
No. heavy atoms	29	27	26	28
No. aromatic heavy atoms	21	16	15	6
No. rotatable bonds	3	0	0	0
No. H-bond acceptors	6	7	7	5
No. H-bond donors	4	4	1	1
Log S (ESOL)	-4.15	-4.31	-3.92	-4.68
Solubility (mg/mL)	2.72e-02	1.82e-02	4.24e-02	8.12e-03
Solubility class*	Moderately soluble	Moderately soluble	Soluble	Moderately soluble
Lipophilicity (Log P_o/w)^×	2.30	2.42	2.15	3.67
Lipinski rule of 5^#	Pass (0)	Pass (0)	Pass (0)	Pass (0)
Ghose filters^#	Pass (0)	Pass (0)	Pass (0)	Pass (0)
Veber filters^#	Pass (0)	Pass (0)	Pass (0)	Pass (0)
Egan filters^#	Pass (0)	Pass (0)	Pass (0)	Pass (0)
Muegge filters	Pass	Pass	Pass	Pass
*Based on the Log S (ESOL) scale, insoluble < -10 < poor < -6 < moderate < -4 < soluble < -2 < very < 0 < highly soluble. ^× The values are average of iLOGP, XLOGP3, WLOGP, MLOGP, and SILICOS-IT. ᶫ Based on the BOILED-Egg model [59]. ^# Numbers within parentheses indicate the number of violations of the respective filter/rule.
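As a worked illustration of how two rows of Table 4 follow from the descriptors, the sketch below counts rule-of-5 violations and assigns the solubility class from Log S. The input values are taken from Table 4 and the Log S scale from the footnote above; the rule-of-5 thresholds are the classic ones (molecular weight ≤ 500, H-bond donors ≤ 5, H-bond acceptors ≤ 10, log P ≤ 5) applied to the averaged Log P, which is an assumption and may differ from the exact lipophilicity descriptor SwissADME uses internally.

```python
def lipinski_violations(mw, hbd, hba, logp):
    """Count violations of the classic rule of 5 (assumed thresholds, see lead-in)."""
    return sum([mw > 500, hbd > 5, hba > 10, logp > 5])


def esol_class(log_s):
    """Solubility class from Log S, following the scale in the Table 4 footnote."""
    if log_s < -10:
        return "Insoluble"
    if log_s < -6:
        return "Poorly soluble"
    if log_s < -4:
        return "Moderately soluble"
    if log_s < -2:
        return "Soluble"
    if log_s < 0:
        return "Very soluble"
    return "Highly soluble"


# Descriptor values from Table 4.
COMPOUNDS = {
    "Compound 1": dict(mw=388.38, hbd=4, hba=6, logp=2.30, log_s=-4.15),
    "Compound 2": dict(mw=368.34, hbd=4, hba=7, logp=2.42, log_s=-4.31),
    "Compound 3": dict(mw=352.29, hbd=1, hba=7, logp=2.15, log_s=-3.92),
    "Compound 4": dict(mw=384.47, hbd=1, hba=5, logp=3.67, log_s=-4.68),
}

summary = {
    name: (
        lipinski_violations(d["mw"], d["hbd"], d["hba"], d["logp"]),
        esol_class(d["log_s"]),
    )
    for name, d in COMPOUNDS.items()
}
for name, (violations, solubility) in summary.items():
    print(name, violations, solubility)
```

With these inputs every compound has zero violations, and only Compound 3 falls in the "Soluble" band, matching the corresponding rows of Table 4.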

Table 5

ADMET profiles of the lead compounds as per results from admetSAR and ADMETlab.
ADMET properties	Compound 1	Compound 2	Compound 3	Compound 4
		Absorption
Gastrointestinal absorptionᶫ	High	High	High	High
Blood-brain barrier permeationᶫ	None	None	None	Yes
		Distribution
Plasma protein binding^×	96.58%	98.23%	95.53%	94.06%
Fraction unbound in plasma	2.52%	3.30%	5.60%	7.13%
		Metabolism
CYP450 2C9 Substrate	Non-substrate	Non-substrate	Non-substrate	Non-substrate
CYP450 2D6 Substrate	Non-substrate	Non-substrate	Non-substrate	Non-substrate
CYP450 3A4 Substrate	Non-substrate	Substrate	Non-substrate	Substrate
CYP450 1A2 Inhibitor	Inhibitor	Inhibitor	Non-inhibitor	Non-inhibitor
CYP450 2C9 Inhibitor	Inhibitor	Inhibitor	Non-inhibitor	Non-inhibitor
CYP450 2D6 Inhibitor	Non-inhibitor	Non-inhibitor	Non-inhibitor	Non-inhibitor
CYP450 2C19 Inhibitor	Inhibitor	Non-inhibitor	Non-inhibitor	Non-inhibitor
CYP450 3A4 Inhibitor	Inhibitor	Non-inhibitor	Non-inhibitor	Non-inhibitor
CYP Inhibitory Promiscuity	High	High	Low	Low
		Excretion^×
T_1/2 (hours)^¶	< 3 (0.349)	> 3 (0.52)	< 3 (0.113)	> 3 (0.70)
		Toxicity
Rat acute (LD₅₀, mol/kg)	2.43	2.38	2.41	2.71
TP* (pIGC₅₀, ug/L)	0.46	1.04	0.57	1.38
Acute oral (LD₅₀ mg/kg) ᶫ	III	III	III	III
Carcinogenicity	None	None	None	None
^× Based on the predictions of the ADMETLab 2.0 [46]. ^¶ Probability of half-life being greater than 3 h is given within parentheses, below 0.5 was considered to have T_1/2 < 3. * Tetrahymena pyriformis toxicity. ᶫ Class I ≤ 50 mg/kg, class II > 50 mg/kg, class III > 500 mg/kg, and class IV > 5000 mg/kg. The SMILES notation of the lead compounds was used as the input to calculate each property (provided in Supplementary Data 8).

3.8 Additional docking studies

The R.T. of FIV, which also exhibits a similar DEDD motif (PDB: 5OVN), was investigated using the same docking approach; similar binding poses with an affinity of -9.0 kcal/mol were observed (detailed results are included in Supplementary Data 9). Furthermore, docking of Phomoarcherin B with the RNase H of Bacteriophage T4 (PDB: 1TFR) and the monomeric reverse transcriptase of MLV (PDB: 4MH8) produced affinities of -8.1 kcal/mol and -8.3 kcal/mol, respectively, further indicating the potential of Phomoarcherin B as an antiviral RNase H inhibitor candidate (detailed log files of the docking experiments are in Supplementary Data 10 and Supplementary Data 11, respectively).

Docking Phomoarcherin B against the transmembrane domain of HIV-1 gp41 (PDB: 5JYN) with locally run computations gave a highest affinity of -5.08 kcal/mol. These results were also confirmed using the SwissDock server, where the highest value was -6.36 kcal/mol. The affinity was taken from the cluster containing the highest number of elements (detailed log files of the docking experiment are in Supplementary Data 12).

HIV infections have continued to claim lives ever since the emergence of the virus. While modern antiviral and HAART therapies provide some relief and support for patients, they also have disadvantages and limitations. This study aimed to perform an extensive computational analysis to discover and evaluate potent novel inhibitors of HIV-1 replication within the host by targeting the most coherent target, providing therapeutic options for patients while accelerating drug development by supplying potential leads.

The mutation rate for HIV-1 has been reported to be 10⁻⁴ to 10⁻² mutants/clones, and with the estimated production of 10⁹ virions/day within an infected individual, the virus mutates efficiently to develop resistance and evade the immune system [60]. However, not all these mutants are expected to survive and replicate, as mutations occurring on some genes could be lethal. The most coherent target for drug discovery and development efforts would be the phenotypes that mutate less frequently, as their chance of developing resistance or evasion is lower than that of their highly mutating counterparts. All this information is useful for determining regions with low mutation rates within the HIV-1 genome. The comparative genomics approach considers the correlations and differences between the genotypes (genomes) of closely related species, or even different variants of the same species, to explain their characteristic phenotypes. This method has also been widely used to discover resistance genes in several bacterial genomes [53, 54].

In this study, we used high-quality HIV-1 sequences from the Los Alamos database and performed whole-genome alignment with MAUVE to identify the regions within the HIV-1 genome that show the highest level of consensus among all the selected sequences. As visualized in Fig. 2, a genomic fragment of around 2.4 kb was the longest fragment (the green bars on top of the sequences in Fig. 2) with the least variation among the selected sequences. The BLASTx results assigned this fragment to the pol gene, which encodes 3 essential functional proteins of HIV-1: the viral reverse transcriptase, integrase, and late-phase protease, all of which are major drug targets. The FDA has already approved NNRTIs and NRTIs, which inhibit the DNA polymerase activity of the R.T. Therefore, our study focused on discovering and analysing potential inhibitors of the R.T. RNase H activity, which, alongside the polymerase activity, is critical for viral replication [55–58].

Based on the computational analysis reported in this paper and the lead selection criteria applied, compound 4 is the best-performing lead compound. It has a docking score of -8.5 kcal/mol; forms several hydrophobic, hydrogen bond, and ionic interactions with active site residues of the HIV-1 RNase H and with the cofactor Mn2+ cations; deviates by less than 1 Å from its initial docked pose throughout the 30 ns molecular dynamics simulation; has a binding free energy of ≈ -35.58 ± 4.84 kcal/mol; and achieves near-perfect scores on each ADMET profile. Further supporting its potential RNase H inhibitory activity, the backbone RMSD of the RT enzyme also reached a plateau after about 20 ns of the production simulation [59, 60].
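The way several criteria combine into a single lead selection can be sketched as a simple filter. The cutoffs below are illustrative assumptions and not the thresholds used in this study. Only the docking score and binding free energy of compound 4 are taken from the text. Its RMSD value is a placeholder standing in for "less than 1 Å", and the other two candidates are invented for contrast.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    docking_score: float        # kcal/mol, more negative is better
    ligand_rmsd: float          # Å, deviation from the docked pose during MD
    binding_free_energy: float  # kcal/mol (MM/PBSA), more negative is better


def select_leads(
    candidates: List[Candidate],
    max_docking: float = -8.0,  # hypothetical cutoff
    max_rmsd: float = 1.0,      # hypothetical cutoff
    max_dg: float = -30.0,      # hypothetical cutoff
) -> List[Candidate]:
    """Keep candidates passing every cutoff, ranked by binding free energy."""
    kept = [
        c for c in candidates
        if c.docking_score <= max_docking
        and c.ligand_rmsd < max_rmsd
        and c.binding_free_energy <= max_dg
    ]
    return sorted(kept, key=lambda c: c.binding_free_energy)


candidates = [
    # Docking score and free energy from the text; RMSD 0.9 is a placeholder.
    Candidate("compound 4", -8.5, 0.9, -35.58),
    # Invented entries, for illustration only.
    Candidate("hypothetical A", -8.9, 2.4, -20.1),
    Candidate("hypothetical B", -7.2, 0.8, -31.0),
]
leads = select_leads(candidates)
print([c.name for c in leads])
```

A hard filter like this is easy to audit but discards near misses; a weighted score would be an alternative when the criteria are expected to trade off against each other.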

Compound 4 (Phomoarcherin B) is a natural compound found in the endophytic fungus Phomopsis archeri. It was first isolated and characterized as a pentacyclic aromatic sesquiterpene via spectroscopic analysis by Hemtasin et al. [61], who tested it for antimalarial activity against Plasmodium falciparum and for anticancer activity against cholangiocarcinoma cell lines. Bedi et al. (62) also reported its anticancer activity, whereas no in vitro or in vivo assay has been performed regarding its antiviral or RNase H inhibitory activity. Major antiretroviral drugs are known to be unable to readily penetrate the blood-brain barrier, which reduces their efficacy against HIV in the brain and results in the formation of reservoirs [62]. As shown in Table 5, Phomoarcherin B is predicted to penetrate the blood-brain barrier, which increases its potential as a drug candidate. However, further in vitro assays and clinical trials are needed to confirm its pharmaceutical potential as an HIV-1 RNase H inhibitor.

Additional docking studies of Phomoarcherin B with the reverse transcriptase of FIV and the monomeric reverse transcriptase of MLV, both of which belong to the Retroviridae family, and with the RNase H of bacteriophage T4, which belongs to the Myoviridae family, also gave promising results (details provided in Supplementary Data 9, Supplementary Data 10, and Supplementary Data 11) (Fig. 5). Additional in vitro and in vivo studies of this assumed anti-RNase H activity would provide more knowledge on the effect of Phomoarcherin B. However, we should stress that computational studies alone are not sufficient to show the inhibitory effect of our suggested candidate compound. The affinity of Phomoarcherin B for nucleic acid at the active site could even be higher than that of the other three compounds. Experimental studies and future work are therefore needed for validation.

Docking server analysis also indicated an affinity of Phomoarcherin B for another critical target, the transmembrane domain of HIV-1 gp41 (Supplementary Data 12) (Fig. 8a).

Genetic diversity was explored by selecting, from a large-scale pool of HIV-1 sequences, those that could represent the worldwide geographical distribution of the virus. The longest common gene region in the HIV-1 genome was then found by MAUVE analysis. Our findings indicated that this common region, the pol gene, encompasses the HIV-1 reverse-transcriptase RNase H enzyme. Afterward, compounds specific to this region were screened from the database, and four promising candidate compounds were found. Among these molecules, Phomoarcherin B, a fungal metabolite of Phomopsis archeri, was assessed against HIV-1 reverse-transcriptase RNase H using data mining methods. Molecular docking and molecular dynamics analyses were then carried out to investigate the interaction of this molecule with the HIV-1 reverse-transcriptase enzyme. Using computational biology methods, we also predicted how multiple (n) Phomoarcherin B molecules may inhibit the transmembrane domain of HIV-1 gp41. Our studies have shown that Phomoarcherin B can potentially disrupt the HIV-1 replication cycle, suggesting that it is a candidate molecule that could be tested in clinical studies.

Author contributions

N.A.G.: formal analysis; writing - original draft preparation. K.K.K.: conceptualization; investigation; formal analysis; funding acquisition; writing – review & editing - original draft preparation. Ö.B.: conceptualization; investigation; formal analysis; writing - original draft preparation. B.E.S.: investigation; formal analysis. R.S.S.: conceptualization; formal analysis; writing - original draft preparation. All authors commented on previous versions and read and approved the final manuscript.

J. Hemelaar, "The origin and diversity of the HIV-1 pandemic," (in eng), Trends Mol Med, vol. 18, no. 3, pp. 182-92, Mar 2012, doi: 10.1016/j.molmed.2011.12.001.
WHO. "HIV data and statistics." https://www.who.int/teams/global-hiv-hepatitis-and-stis-programmes/hiv/strategic-information/hiv-data-and-statistics (accessed 2023).
UNAIDS. "Global HIV & AIDS statistics " https://www.unaids.org/en/resources/fact-sheet (accessed 2023).
M. S. Cohen, N. Hellmann, J. A. Levy, K. DeCock, and J. Lange, "The spread, treatment, and prevention of HIV-1: evolution of a global pandemic," (in eng), J Clin Invest, vol. 118, no. 4, pp. 1244-54, Apr 2008, doi: 10.1172/jci34706.
F. Kirchhoff, "HIV Life Cycle: Overview," 2013, pp. 1-9.
C. M. Swanson and M. H. Malim, "SnapShot: HIV-1 proteins," (in eng), Cell, vol. 133, no. 4, pp. 742, 742.e1, May 16 2008, doi: 10.1016/j.cell.2008.05.005.
E. Fanales-Belasio, M. Raimondo, B. Suligoi, and S. Buttò, "HIV virology and pathogenetic mechanisms of infection: a brief overview," (in eng), Ann Ist Super Sanita, vol. 46, no. 1, pp. 5-14, 2010, doi: 10.4415/ann_10_01_02.
D. S. Ruelas and W. C. Greene, "An integrated overview of HIV-1 latency," (in eng), Cell, vol. 155, no. 3, pp. 519-29, Oct 24 2013, doi: 10.1016/j.cell.2013.09.044.
P. A. Volberding and S. G. Deeks, "Antiretroviral therapy and management of HIV infection," (in eng), Lancet, vol. 376, no. 9734, pp. 49-62, Jul 3 2010, doi: 10.1016/s0140-6736(10)60676-9.
V. Poongavanam and J. Kongsted, "Virtual Screening Models for Prediction of HIV-1 RT Associated RNase H Inhibition," PLOS ONE, vol. 8, no. 9, p. e73478, 2013, doi: 10.1371/journal.pone.0073478.
Y. Shin et al., "Identification of Aristolactam Derivatives That Act as Inhibitors of Human Immunodeficiency Virus Type 1 Infection and Replication by Targeting Tat-Mediated Viral Transcription," (in eng), Virol Sin, vol. 36, no. 2, pp. 254-263, Apr 2021, doi: 10.1007/s12250-020-00274-7.
G. Poli, C. Granchi, F. Rizzolio, and T. Tuccinardi, "Application of MM-PBSA Methods in Virtual Screening," (in eng), Molecules, vol. 25, no. 8, Apr 23 2020, doi: 10.3390/molecules25081971.
B. Zhang, M. P. D’Erasmo, R. P. Murelli, and E. Gallicchio, "Free Energy-Based Virtual Screening and Optimization of RNase H Inhibitors of HIV-1 Reverse Transcriptase," ACS Omega, vol. 1, no. 3, pp. 435-447, 2016/09/30 2016, doi: 10.1021/acsomega.6b00123.
Ö. Baysal, N. Abdul Ghafoor, R. S. Silme, A. N. Ignatov, and V. Kniazeva, "Molecular dynamics analysis of N-acetyl-D-glucosamine against specific SARS-CoV-2’s pathogenicity factors," PLOS ONE, vol. 16, no. 5, p. e0252571, 2021, doi: 10.1371/journal.pone.0252571.
Ö. Baysal, R. Silme, C. Karaaslan, and A. Ignatov, "Genetic uniformity of a specific region in SARS-CoV-2 genome and repurposing of N-acetyl-D-glucosamine," Fresenius Environmental Bulletin, vol. 30, pp. 2848-2857, 02/22 2021.
B. Ömür and S. Ragıp Soner, "Utilization from Computational Methods and Omics Data for Antiviral Drug Discovery to Control of SARS-CoV-2," in SARS-CoV-2 Origin and COVID-19 Pandemic Across the Globe, K. Vijay Ed. Rijeka: IntechOpen, 2021, p. Ch. 4.
C. Kuiken, B. Korber, and R. W. Shafer, "HIV sequence databases," (in eng), AIDS Rev, vol. 5, no. 1, pp. 52-61, Jan-Mar 2003.
J. D. Thompson, D. G. Higgins, and T. J. Gibson, "CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice," (in eng), Nucleic Acids Res, vol. 22, no. 22, pp. 4673-80, Nov 11 1994, doi: 10.1093/nar/22.22.4673.
K. Okonechnikov, O. Golosova, and M. Fursov, "Unipro UGENE: a unified bioinformatics toolkit," (in eng), Bioinformatics, vol. 28, no. 8, pp. 1166-7, Apr 15 2012, doi: 10.1093/bioinformatics/bts091.
A. C. Darling, B. Mau, F. R. Blattner, and N. T. Perna, "Mauve: multiple alignment of conserved genomic sequence with rearrangements," (in eng), Genome Res, vol. 14, no. 7, pp. 1394-403, Jul 2004, doi: 10.1101/gr.2289704.
R. C. Edgar, "MUSCLE: a multiple sequence alignment method with reduced time and space complexity," BMC Bioinformatics, vol. 5, no. 1, p. 113, 2004/08/19 2004, doi: 10.1186/1471-2105-5-113.
Geneious Prime 2023.1. (2023). [Online]. Available: https://www.geneious.com
A. M. Waterhouse, J. B. Procter, D. M. Martin, M. Clamp, and G. J. Barton, "Jalview Version 2--a multiple sequence alignment editor and analysis workbench," (in eng), Bioinformatics, vol. 25, no. 9, pp. 1189-91, May 1 2009, doi: 10.1093/bioinformatics/btp033.
S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, "Basic local alignment search tool," (in eng), J Mol Biol, vol. 215, no. 3, pp. 403-10, Oct 5 1990, doi: 10.1016/s0022-2836(05)80360-2.
D. J. States and W. Gish, "Combined use of sequence similarity and codon bias for coding region identification," (in eng), J Comput Biol, vol. 1, no. 1, pp. 39-50, Spring 1994, doi: 10.1089/cmb.1994.1.39.
D. M. Himmel et al., "Structure of HIV-1 reverse transcriptase with the inhibitor beta-Thujaplicinol bound at the RNase H active site," (in eng), Structure, vol. 17, no. 12, pp. 1625-1635, Dec 9 2009, doi: 10.1016/j.str.2009.09.016.
H. Berman, K. Henrick, and H. Nakamura, "Announcing the worldwide Protein Data Bank," (in eng), Nat Struct Biol, vol. 10, no. 12, p. 980, Dec 2003, doi: 10.1038/nsb1203-980.
B. Webb and A. Sali, "Comparative Protein Structure Modeling Using MODELLER," (in eng), Curr Protoc Bioinformatics, vol. 54, pp. 5.6.1-5.6.37, Jun 20 2016, doi: 10.1002/cpbi.3.
A. Fiser, R. K. Do, and A. Sali, "Modeling of loops in protein structures," (in eng), Protein Sci, vol. 9, no. 9, pp. 1753-73, Sep 2000, doi: 10.1110/ps.9.9.1753.
The PyMOL molecular graphics system. (015). [Online]. Available: https://pymol.org/2/
G. M. Morris et al., "AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility," (in eng), J Comput Chem, vol. 30, no. 16, pp. 2785-91, Dec 2009, doi: 10.1002/jcc.21256.
T. Sterling and J. J. Irwin, "ZINC 15 – Ligand Discovery for Everyone," Journal of Chemical Information and Modeling, vol. 55, no. 11, pp. 2324-2337, 2015/11/23 2015, doi: 10.1021/acs.jcim.5b00559.
O. Trott and A. J. Olson, "AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading," (in eng), J Comput Chem, vol. 31, no. 2, pp. 455-61, Jan 30 2010, doi: 10.1002/jcc.21334.
S. Salentin, S. Schreiber, V. J. Haupt, M. F. Adasme, and M. Schroeder, "PLIP: fully automated protein-ligand interaction profiler," (in eng), Nucleic Acids Res, vol. 43, no. W1, pp. W443-7, Jul 1 2015, doi: 10.1093/nar/gkv315.
Schrödinger Release 2022-3: Maestro (2021). New York, NY. [Online]. Available: https://www.schrodinger.com/products/maestro
H. Liu and T. Hou, "CaFE: a tool for binding affinity prediction using end-point free energy methods," Bioinformatics, vol. 32, no. 14, pp. 2216-2218, 2016, doi: 10.1093/bioinformatics/btw215.
T. Hou, J. Wang, Y. Li, and W. Wang, "Assessing the performance of the MM/PBSA and MM/GBSA methods. 1. The accuracy of binding free energy calculations based on molecular dynamics simulations," (in eng), J Chem Inf Model, vol. 51, no. 1, pp. 69-82, Jan 24 2011, doi: 10.1021/ci100275a.
N. Singh and A. Warshel, "Absolute binding free energy calculations: on the accuracy of computational scoring of protein-ligand interactions," (in eng), Proteins, vol. 78, no. 7, pp. 1705-23, May 15 2010, doi: 10.1002/prot.22687.
A. Daina, O. Michielin, and V. Zoete, "SwissADME: a free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules," Scientific Reports, vol. 7, no. 1, p. 42717, 2017/03/03 2017, doi: 10.1038/srep42717.
C. A. Lipinski, F. Lombardo, B. W. Dominy, and P. J. Feeney, "Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings," (in eng), Adv Drug Deliv Rev, vol. 46, no. 1-3, pp. 3-26, Mar 1 2001, doi: 10.1016/s0169-409x(00)00129-0.
A. K. Ghose, V. N. Viswanadhan, and J. J. Wendoloski, "A knowledge-based approach in designing combinatorial or medicinal chemistry libraries for drug discovery. 1. A qualitative and quantitative characterization of known drug databases," (in eng), J Comb Chem, vol. 1, no. 1, pp. 55-68, Jan 1999, doi: 10.1021/cc9800071.
D. F. Veber, S. R. Johnson, H. Y. Cheng, B. R. Smith, K. W. Ward, and K. D. Kopple, "Molecular properties that influence the oral bioavailability of drug candidates," (in eng), J Med Chem, vol. 45, no. 12, pp. 2615-23, Jun 6 2002, doi: 10.1021/jm020017n.
W. J. Egan, K. M. Merz, and J. J. Baldwin, "Prediction of Drug Absorption Using Multivariate Statistics," Journal of Medicinal Chemistry, vol. 43, no. 21, pp. 3867-3877, 2000/10/01 2000, doi: 10.1021/jm000292e.
I. Muegge, S. L. Heald, and D. Brittelli, "Simple selection criteria for drug-like chemical matter," (in eng), J Med Chem, vol. 44, no. 12, pp. 1841-6, Jun 7 2001, doi: 10.1021/jm015507e.
F. Cheng et al., "admetSAR: a comprehensive source and free tool for assessment of chemical ADMET properties," (in eng), J Chem Inf Model, vol. 52, no. 11, pp. 3099-105, Nov 26 2012, doi: 10.1021/ci300367a.
G. Xiong et al., "ADMETlab 2.0: an integrated online platform for accurate and comprehensive predictions of ADMET properties," (in eng), Nucleic Acids Res, vol. 49, no. W1, pp. W5-w14, Jul 2 2021, doi: 10.1093/nar/gkab255.
M. Galilee and A. Alian, "The structure of FIV reverse transcriptase and its implications for non-nucleoside inhibitor resistance," PLOS Pathogens, vol. 14, no. 1, p. e1006849, 2018, doi: 10.1371/journal.ppat.1006849.
M. Bhagwat, D. Meara, and N. G. Nossal, "Identification of Residues of T4 RNase H Required for Catalysis and DNA Binding*," Journal of Biological Chemistry, vol. 272, no. 45, pp. 28531-28538, 1997/11/07/ 1997, doi: https://doi.org/10.1074/jbc.272.45.28531.
D. Das and M. M. Georgiadis, "The Crystal Structure of the Monomeric Reverse Transcriptase from Moloney Murine Leukemia Virus," Structure, vol. 12, no. 5, pp. 819-829, 2004/05/01/ 2004, doi: https://doi.org/10.1016/j.str.2004.02.032.
J. Dev et al., "Structural basis for membrane anchoring of HIV-1 envelope spike," (in eng), Science, vol. 353, no. 6295, pp. 172-175, Jul 8 2016, doi: 10.1126/science.aaf7066.
Avogadro: an open-source molecular builder and visualization tool. (2022). [Online]. Available: https://avogadro.cc/
A. P. Waterson, "Acquired immune deficiency syndrome," (in eng), Br Med J (Clin Res Ed), vol. 286, no. 6367, pp. 743-6, Mar 5 1983, doi: 10.1136/bmj.286.6367.743.
P. E. Fournier et al., "Comparative genomics of multidrug resistance in Acinetobacter baumannii," (in eng), PLoS Genet, vol. 2, no. 1, p. e7, Jan 2006, doi: 10.1371/journal.pgen.0020007.
R. C. Hardison, "Comparative genomics," (in eng), PLoS Biol, vol. 1, no. 2, p. E58, Nov 2003, doi: 10.1371/journal.pbio.0000058.
E. De Clercq, "Non-nucleoside reverse transcriptase inhibitors (NNRTIs): past, present, and future," (in eng), Chem Biodivers, vol. 1, no. 1, pp. 44-64, Jan 2004, doi: 10.1002/cbdv.200490012.
R. W. King, R. M. Klabe, C. D. Reid, and S. K. Erickson-Viitanen, "Potency of nonnucleoside reverse transcriptase inhibitors (NNRTIs) used in combination with other human immunodeficiency virus NNRTIs, NRTIs, or protease inhibitors," (in eng), Antimicrob Agents Chemother, vol. 46, no. 6, pp. 1640-6, Jun 2002, doi: 10.1128/aac.46.6.1640-1646.2002.
E. De Clercq, "Perspectives of non-nucleoside reverse transcriptase inhibitors (NNRTIs) in the therapy of HIV-1 infection," (in eng), Farmaco, vol. 54, no. 1-2, pp. 26-45, Jan-Feb 1999, doi: 10.1016/s0014-827x(98)00103-7.
G. L. Melikian et al., "Non-nucleoside reverse transcriptase inhibitor (NNRTI) cross-resistance: implications for preclinical evaluation of novel NNRTIs and clinical genotypic resistance testing," (in eng), J Antimicrob Chemother, vol. 69, no. 1, pp. 12-20, Jan 2014, doi: 10.1093/jac/dkt316.
J. Huang et al., "CHARMM36m: an improved force field for folded and intrinsically disordered proteins," (in eng), Nat Methods, vol. 14, no. 1, pp. 71-73, Jan 2017, doi: 10.1038/nmeth.4067.
S. Jo, T. Kim, V. G. Iyer, and W. Im, "CHARMM-GUI: a web-based graphical user interface for CHARMM," (in eng), J Comput Chem, vol. 29, no. 11, pp. 1859-65, Aug 2008, doi: 10.1002/jcc.20945.
C. Hemtasin et al., "Cytotoxic pentacyclic and tetracyclic aromatic sesquiterpenes from Phomopsis archeri," (in eng), J Nat Prod, vol. 74, no. 4, pp. 609-13, Apr 25 2011, doi: 10.1021/np100632g.
O. Osborne, N. Peyravian, M. Nair, S. Daunert, and M. Toborek, "The Paradox of HIV Blood-Brain Barrier Penetrance and Antiretroviral Drug Delivery Deficiencies," (in eng), Trends Neurosci, vol. 43, no. 9, pp. 695-708, Sep 2020, doi: 10.1016/j.tins.2020.06.007.

Table 1 is available in the Supplementary Files section.

No competing interests reported.


Editorial decision: Major revision
31 May, 2023
Editor assigned by journal
31 May, 2023
Submission checks completed at journal
31 May, 2023
First submitted to journal
30 May, 2023
