Comparative modeling and enzymatic affinity of novel haloacid dehalogenase from Bacillus megaterium strain BHS1 isolated from alkaline Blue Lake in Turkey

Abstract This study presents the initial structural model of L-haloacid dehalogenase (DehLBHS1) from Bacillus megaterium BHS1, an alkalotolerant bacterium known for its ability to degrade halogenated environmental pollutants. The model provides insights into the structural features of DehLBHS1 and expands our understanding of the enzymatic mechanisms involved in the degradation of these hazardous pollutants. Key amino acid residues (Arg40, Phe59, Asn118, Asn176, and Trp178) in DehLBHS1 were identified to play critical roles in catalysis and molecular recognition of haloalkanoic acid, essential for efficient binding and transformation of haloalkanoic acid molecules. DehLBHS1 was modeled using I-TASSER, yielding a best TM-score of 0.986 and an RMSD of 0.53 Å. Validation of the model using PROCHECK revealed that 89.2% of the residues were located in the most favored region, providing confidence in its structural accuracy. Molecular docking simulations showed that the non-simulated DehLBHS1 preferred 2,2DCP over other substrates, forming one hydrogen bond with Arg40 and exhibiting a minimum energy of −2.5 kJ/mol. The simulated DehLBHS1 exhibited a minimum energy of −4.3 kJ/mol and formed four hydrogen bonds with Arg40, Asn176, Asp9, and Tyr11, further confirming the preference for 2,2DCP. Molecular dynamics simulations supported this preference, based on various metrics, including RMSD, RMSF, gyration, hydrogen bonding, and molecular distance. MM-PBSA calculations showed that the DehLBHS1-2,2-DCP complex had a markedly lower binding energy (−21.363 ± 1.26 kcal/mol) than the DehLBHS1-3CP complex (-14.327 ± 1.738 kcal/mol). This finding has important implications for the substrate specificity and catalytic function of DehLBHS1, particularly in the bioremediation of 2,2-DCP in contaminated alkaline environments. 
These results provide a detailed view of the molecular interactions between the enzyme and its substrate and may aid in the development of more efficient biocatalytic strategies for the degradation of halogenated compounds. Communicated by Ramaswamy H. Sarma


Introduction
Microorganisms produce specialized enzymes that can remove pollutants from the environment. Pollutants such as oils, solvents and pesticides serve as energy sources for many microbes, which makes microbial degradation a cheaper and more sustainable clean-up option. Implementing microbial processes to decontaminate toxic chemicals requires a greater understanding of why and how microorganisms produce specialized enzymes that eliminate these chemicals and transform them into carbon sources for their own survival, leaving a cleaner environment (Landa-Acuña et al., 2020). Determining the 3D structure of a protein, which depends on its primary amino acid sequence, is key to understanding its mechanism and function. Several studies have reported the structures of haloacid dehalogenases, including DehIVa of Burkholderia cepacia MBA4 (Schmidberger et al., 2005), DhlB of Xanthobacter autotrophicus GJ10 (Ridder et al., 1995), L-HAD of Sulfolobus tokodaii 7 (Rye et al., 2009), L-DEX from Pseudomonas sp. YL (Hisano et al., 1996), and the 2-haloacid dehalogenases DehD, DehL, and DehE from Rhizobium sp. RC1 (Adamu et al., 2016; Hamid et al., 2013; Sudi et al., 2014). Many studies have examined bacterial growth on 2-haloacids as a carbon source in the search for dehalogenases and their corresponding genes (Cairns et al., 1996; Edbeib et al., 2020a; 2020b; Fortin et al., 1998; Ismail et al., 2018; Oyewusi et al., 2020a; Stringfellow et al., 1997). However, the discovery of new dehalogenases and the characterization of their properties remain an active area of research that deserves further study.
Chlorinated aliphatic compounds such as 2,2-dichloropropionic acid (2,2-DCP) and 3-chloropropionic acid (3-CP) represent one of the most important groups of industrially produced chemicals; they are recalcitrant and persist in the environment for long periods. 2,2-DCP, also known as Dalapon, and 3-CP are synthetic compounds whose carboxylate ions act as active ingredients in herbicides. 2,2-DCP and 3-CP are classified as α- and β-chlorinated alkanoic acids, respectively, and are known environmental pollutants. These compounds are hazardous to humans because of their toxicity, and their effects are exacerbated by their recalcitrance and ability to bioaccumulate (Oyewusi et al., 2020b, 2021a).
Dehalogenases belong to the family of hydrolases that act specifically on carbon-halide bonds (Muslem et al., 2017; Nardi-Dei et al., 1997). Dehalogenation processes have been categorised into various forms depending on substrate specificity and the configuration of the products (Harisna et al., 2017; Huyop et al., 2004; 2008; Slater et al., 1995). The two best-characterized classes of haloacid dehalogenases, Group I and Group II, are distinguished by their different catalytic mechanisms and reaction intermediates (Ang et al., 2018; Schmidberger et al., 2007). Groups I and II exhibit non-stereospecific and stereospecific dehalogenation, respectively. Group II or L-type enzymes include the L-2-haloacid dehalogenases (i.e. L-DEXs), which specifically target L-2-haloacids (Wang et al., 2018). The D-type dehalogenase is less common in nature than the L-type enzyme (Thasif et al., 2009). The dehalogenation of 2-haloacids liberates halide ions and produces 2-hydroxy acids.
Recently, Bacillus megaterium strain BHS1, which can utilize 2,2-dichloropropionic acid as a sole source of carbon and energy, was isolated from Blue Lake (Mavi Göl), Turkey. Hence, it can be hypothesized that BHS1 contains a dehalogenase gene suitable for bioremediation. Gaining insight into the alkaline adaptation and stability of the novel dehalogenase from the newly isolated Bacillus megaterium BHS1 requires an understanding of its protein structure. Currently, no report describes a full-length structural model of a haloacid dehalogenase from Bacillus megaterium BHS1. This study therefore aims to perform computational modelling of the putative haloacid dehalogenase from Bacillus megaterium BHS1 to provide comprehensive information on the structure and mechanistic role of this enzyme.

Source of dehalogenase gene
Bacillus megaterium strain BHS1 produces dehalogenases, as shown by its ability to grow on a minimal medium containing 2,2-DCP as the primary carbon source (Wahhab et al. 2020). A functional haloacid-degrading gene in Bacillus megaterium strain BHS1 was identified by genomic analysis of its full genome sequence. Genes linked to dehalogenase activity were located within the genome of Bacillus megaterium strain BHS1 by locus tag (Table 1). The Cof-type HAD-IIB family hydrolase is only distantly related to dehalogenases, as reported by Nemati et al. (2013); therefore, the haloacid dehalogenase type II was characterized further and designated DehLBHS1.

The 3D-structure prediction
The 3D structure of DehLBHS1 was built from sequence alignments with one or more template proteins of known structure. Three tools were used to build the DehLBHS1 model from the architecture of homologous proteins: the multi-template threading method in I-TASSER (Yang & Zhang, 2015), the SWISS-MODEL automated comparative protein modelling server (https://swissmodel.expasy.org), and the neural-network-based AlphaFold2 (https://www.ebi.ac.uk/Tools/sss/fasta/ or https://alphafold.ebi.ac.uk/). I-TASSER employs various threading algorithms to provide structural templates, which are assembled by iterative fragment assembly simulations. In contrast, SWISS-MODEL is a homology (or comparative) modelling method that generates reliable and accurate 3D models of proteins sharing significant sequence similarity with proteins of known structure. Five models were generated and the final model was chosen on the basis of the confidence score. The DehLBHS1 sequence was aligned with the sequences of the three top threading templates identified by I-TASSER and SWISS-MODEL. Meanwhile, AlphaFold2 predicts a protein's 3D structure from its amino acid sequence using a deep learning algorithm that incorporates physical and biological knowledge about protein structure and exploits multiple sequence alignments. AlphaFold2 has been shown to be accurate, with predictions largely consistent with experimental structures (Jumper et al., 2021; Senior et al., 2020).

Refinement and evaluation of the model
Molecular dynamics (MD)-based techniques were used to refine the protein model (Feig, 2017). The MD simulation of DehLBHS1 in apo form was performed in GROMACS 5.1 with the GROMOS 54a7 force field, at a constant temperature of 300 K and a constant pressure of 1 atm. Energy minimization used the steepest descent and conjugate gradient methods: the system was minimized with the steepest descent algorithm for up to 515 steps or until the maximum force (Fmax) was less than 1000 kJ mol⁻¹ nm⁻¹ (the default threshold). This step removes steric clashes and inappropriate geometry in the solvated system. The production simulation was run in triplicate, with the DehLBHS1 protein placed in a cubic box with a minimum distance of 1.0 nm between the solute and the box edge. The solute was solvated with the simple point charge (SPC) water model, and seven sodium ions were added to the simulation box to neutralize the system. The structure was equilibrated for 50 ps with an integration time step of 2 fs, and the equilibrated structure was then simulated for 100 ns with the same time step. MD simulation analyzes the physical movements of atoms and molecules over a fixed period (Fan & Mark, 2004) and can improve a structural model by drawing it closer to the native structure. Protein stability was monitored by the root-mean-square deviation (RMSD). VMD version 1.9.2 was used to visualize the structure of DehLBHS1 from the MD trajectory (Humphrey et al., 1996).
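The RMSD used to monitor stability can be illustrated with a minimal sketch. It assumes the frame has already been superimposed on the reference (in practice the fitting is done by the trajectory analysis tool), and the coordinates below are invented purely for illustration.

```python
import math

def rmsd(coords, reference):
    """Root-mean-square deviation between two equally sized lists of (x, y, z).

    Assumes both structures are already superimposed; no fitting is done here.
    """
    if len(coords) != len(reference):
        raise ValueError("both structures must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(coords, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(coords))

# Hypothetical three-atom "backbone" (nm); every atom is shifted by 0.1 nm in x.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
frame = [(x + 0.1, y, z) for x, y, z in reference]
print(round(rmsd(frame, reference), 3))
```

Because every atom moves by the same 0.1 nm, the RMSD equals that displacement; in a real trajectory the value is computed per frame and plotted against time.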
PROCHECK, Verify3D, and ERRAT were used to assess the validity of the stabilized model in terms of stereochemical abnormalities, sequence fitness to structure, and non-bonded interaction traits (Pontius et al., 1996). A Ramachandran plot with marked secondary structure elements and an example of steric distortion was generated using PROCHECK.

Docking calculations and active site determination
The active-site residues of DehLBHS1 were predicted with the COACH server, which generates complementary ligand-binding site predictions from the supplied protein structure using two comparative approaches. I-TASSER was used to create the 3D models, which were then passed to COACH to predict ligand-binding sites. ClustalW 2.1 was then used to identify the catalytic triad and highly conserved residues of the protein.
Regions of significant sequence similarity are found by maximizing the alignment score, which may include a gap penalty (Khersonsky et al., 2018; Xu et al., 2017). Catalytic residues were then displayed in PyMOL by superimposing the predicted active-site residues on the template structure (Batumalaie et al., 2018).
Organic pollutants such as 3CP, 2,2DCP, D2CP, and L2CP (Figure 1) found in alkaline settings are harder to degrade naturally, since traditional microbiological methods do not work well at high pH. Bioremediation of such environments requires microbes that tolerate high pH, namely alkalotolerant species. While the potential applications of their enzymes are well established, using them for pollutant degradation in extreme environments has proved difficult. MD simulations under high-pH conditions could help overcome these challenges by deciphering, at the atomic level, the structure, dynamics, and function or adaptation of such biomolecular systems. Docking calculations of the DehLBHS1 ligands (3CP, 2,2DCP, D2CP, and L2CP) on the receptor-binding site were performed using AutoDockTools4 (Morris et al., 2009). To cover the biological binding site completely, grid maps of 48 × 52 × 54 grid points with a grid spacing of 1.0 Å were generated by AutoGrid (Morris et al., 2009) and centered on the conserved active-site residues for 3CP, 2,2DCP, D2CP, and L2CP. The optimal pose was chosen among the various structural positions based on the native AutoDock score. Both AutoDock4, as part of AutoDockTools4, and AutoDock Vina were used in this study. AutoDock version 4.2.6 was used for molecule preparation (i.e. protein and ligand preparation), and molecular docking between each ligand and DehLBHS1 was conducted using AutoDock Vina, which combines certain advantages of knowledge-based potentials and empirical scoring functions. It extracts empirical information from the conformational preferences of protein-ligand complexes and from experimental affinity measurements. Additionally, to see whether refinement had any impact on the contacts formed between the protein and the ligand, the best simulated and refined DehLBHS1 structure was docked again with AutoDock Vina and subjected to interaction analysis (Isa et al., 2022; Salentin et al., 2015).
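As a rough illustration of how the docking box follows from the grid settings, the sketch below converts the reported grid dimensions and spacing into box edge lengths and places the box centre at the centroid of a set of active-site atoms. It treats the reported values as the number of grid intervals per axis, which is an assumption, and the atom coordinates are hypothetical placeholders rather than values from the DehLBHS1 model.

```python
def box_size(points, spacing):
    """Edge lengths (Å) of a grid box from grid intervals per axis and spacing (Å)."""
    return tuple(n * spacing for n in points)

def centroid(coords):
    """Geometric centre of a list of (x, y, z) coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

# Grid dimensions and spacing as reported in the text.
size = box_size((48, 52, 54), 1.0)

# Hypothetical coordinates (Å) standing in for conserved active-site atoms.
site_atoms = [(10.0, 4.0, -2.0), (12.0, 6.0, 0.0), (14.0, 8.0, 2.0)]
center = centroid(site_atoms)

print("box size (Å):", size)
print("box centre (Å):", center)
```

With a spacing of 1.0 Å the box edges are numerically equal to the grid dimensions, so the box comfortably encloses a small haloacid ligand and its neighbouring residues.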

Parameterization and molecular dynamic (MD) simulation of DehLBHS1-3CP and 2,2DCP complex
Parameterization is the process of obtaining force-field parameters that describe how ligands behave in an MD simulation, while protein-ligand docking predicts the binding mode of small organic molecules to their target and provides an estimate of the binding affinity. Enzymes were treated as rigid molecules in molecular docking. This rigid treatment works well for non-flexible proteins. DehLBHS1, however, is a flexible enzyme with an instability index of 36.11; consequently, ligand binding can induce further conformational changes in the DehLBHS1 structure (Shen et al., 2013). Conformational changes therefore had to be taken into account to produce a more accurate representation. To capture DehLBHS1-substrate interactions, molecular dynamics simulations were performed on the DehLBHS1-3CP and DehLBHS1-2,2DCP complexes and on the uncomplexed DehLBHS1 structure.
Standard MD force fields do not include parameters for small molecules such as L-2CP. The force field parameters for 3CP and 2,2DCP therefore had to be determined separately before the MD simulation of the DehLBHS1-substrate complexes. As a result, 3CP and 2,2DCP were parametrized using the Automated Topology Builder and Repository (Basri et al., 2015) version 3.0 (Malde et al., 2011). Each DehLBHS1-substrate complex was subjected to energy minimisation with the steepest descent and conjugate gradient methods. Steepest descent converged in 869 steps to Fmax < 1000 kJ mol⁻¹ nm⁻¹. MD simulation of the complexes was performed using the GROMOS 54a7 force field in a dodecahedron box containing 13,630 SPC/E water molecules and three chloride counter-ions for 30 ns (Van Der Spoel et al., 2005).

Calculation of binding free energies
Lastly, the binding free energies of the DehLBHS1 protein and its substrates were calculated using the g_mmpbsa tool with the GROMACS package and the Adaptive Poisson-Boltzmann Solver (APBS) (Kumar et al., 2014). The Molecular Mechanics Poisson-Boltzmann Surface Area (MM-PBSA) interaction energy is estimated by g_mmpbsa, a standalone program made up of subroutines from the GROMACS and APBS packages. The binding free energy of each DehLBHS1-substrate complex was determined following earlier similar investigations (Oyewusi et al., 2020b, 2021b).
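For reference, the decomposition usually associated with MM-PBSA can be written as follows. This is the general textbook form, not an equation taken from this study; the entropy term $TS$ is often omitted in practice, and the text does not state whether it was included here.

```latex
\begin{align}
\Delta G_{\text{bind}} &= G_{\text{complex}} - \left( G_{\text{protein}} + G_{\text{ligand}} \right) \\
G_{x} &= \langle E_{\text{MM}} \rangle + \langle G_{\text{solv}} \rangle - TS \\
E_{\text{MM}} &= E_{\text{bonded}} + E_{\text{vdW}} + E_{\text{elec}} \\
G_{\text{solv}} &= G_{\text{polar}} + G_{\text{nonpolar}}
\end{align}
```

Here $x$ stands for the complex, the protein, or the ligand, and the angle brackets denote averages over trajectory frames. A more negative $\Delta G_{\text{bind}}$ indicates stronger predicted binding.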

Results and discussion
Alkalotolerant Bacillus megaterium BHS1 can degrade many halogenated compounds, notably 2,2-DCP at a high concentration (40 mM) under alkaline conditions (pH 9) like those of a soda lake, and this ability needs further investigation. It is therefore important to study the protein structure and the main catalytic residues of DehLBHS1 that allow the bacterium to survive and adapt to a high concentration of 2,2-DCP in an extreme environment. The DehLBHS1 enzyme maintains its function in a highly alkaline environment and remains sufficiently stable to bind its degradable substrates. The 3D structural details describing the catalytic action of DehLBHS1 against the various substrates, together with its structural dynamics, were closely monitored. The 3D model of the enzyme was predicted in an attempt to illustrate the potential alkalophilic adaptations of the novel DehLBHS1. Several characteristics of this enzyme demonstrate its structural stability. The data provide valuable insights into how the DehLBHS1 structure is adapted, which may help extend its catalytic action to other substrates and tune its specificity in future.
A detailed analysis of the novel protein DehLBHS1 is needed to understand the structure-function relationship of this enzyme, which was isolated from a highly alkaline environment. However, the absence of an X-ray structure of an alkaline-adapted dehalogenase from alkalophilic bacteria has made interpreting the experimental data almost impossible. Therefore, the structure of DehLBHS1 was modelled in silico to clarify the characteristics of this alkaline-adapted protein.

Comparative modeling of DehLBHS1
In principle, the fold of a protein should be predictable solely from its amino acid sequence, but this is challenging in practice. Alternatively, template-based or comparative modeling adopts the experimental structure of a sequence relative (e.g. a homolog) as a template for reconstructing the 3D structure of the target protein. Protein structure prediction facilitates structural understanding by providing quickly actionable structural hypotheses. Previous structural prediction studies utilized several modeling tools such as the SWISS-MODEL Repository, I-TASSER, Genome3D, and others. These tools give researchers access to large numbers of structures and encourage their free use by the community. Likewise, AlphaFold2 is another bioinformatics tool designed for protein structure prediction that also supports other functions, such as protein design, function annotation, disorder prediction, and domain identification and classification. Several studies have used this tool for structural predictions at scale and with good accuracy (Jumper et al., 2021; Senior et al., 2020). In this work, we investigate the monomer subunit of the protein to explain the functional affinity of the enzyme towards haloacid compounds.
The sequence identity between DehLBHS1 and the PDB template is limited. When dealing with low sequence identity, it has been shown that the sequence-structure alignment technique (threading) produces a more reliable model than the sequence-sequence alignment used in homology modelling. Moreover, using multiple models enhances alignment coverage, as well as model accuracy and quality (Larsson et al., 2008). Consequently, the 3D structure of DehLBHS1 was constructed using a multi-template threading approach through I-TASSER (Yang & Zhang, 2015). I-TASSER employs various threading algorithms to provide structural templates, whereas SWISS-MODEL relies on significant sequence similarity with proteins of known structure.
The DehLBHS1 model was based on the reference structures of L-2-haloacid dehalogenases (PDB IDs: 1QH9, 1ZRN and 3UM9). I-TASSER, SWISS-MODEL, and AlphaFold all identified the L-2-haloacid dehalogenase from Pseudomonas sp. YL (1QH9) as the most suitable template, with a similarity of 40%. The I-TASSER-generated DehLBHS1 model performed best, with a TM-score of 0.986 and an RMSD of 0.53 Å, suggesting that the predicted 3D conformation is highly favorable. This model was selected for subsequent analysis, as illustrated in Figure 2.
In general, most dehalogenases have been reported in dimeric form. Homodimerization confers several advantages, such as improved stability and substrate specificity (Marianayagam et al., 2004). In all experimental 3D structures of L-2-haloacid dehalogenases, the functional domain or active site is located away from the dimer interface (Adamu et al., 2017; Hisano et al., 1996). In previous in silico investigations, the monomeric form of the protein has also been used to establish the function of several active-site residues (Nakamura et al., 2009). In our study, DehLBHS1 was modelled in monomeric form using several low-similarity templates. Interestingly, the topology of the enzyme structure is highly similar to that of other related haloacid dehalogenases (S1). Protein sequences are less conserved than protein structures. Hence, many proteins can share a similar fold even if they lack high sequence similarity.

Molecular structure refinement of DehLBHS1
Predicted models are not entirely reliable; therefore, energy minimisation is necessary to refine the enzyme model and bring it closer to its native state without major errors (Park et al., 2018). This requires molecular dynamics (MD) simulation, which is helpful for protein refinement (Adamu et al., 2016; Fan & Mark, 2004). Model refinement raises the accuracy of homology-based and ab initio models (Fan & Mark, 2004), and energy minimization alleviates structural problems, including steric collisions and strain in the static structure. In addition, refinement by MD simulation allows atoms to move within the boundaries of a specified force field to approach the global energy minimum configuration (Adamu et al., 2017). After energy minimisation of the optimal DehLBHS1 model (C-score 1.23), its stability during the MD simulation was gauged by the RMSD of the backbone atoms as a function of time, as illustrated in S2. Following the energy minimisation, the DehLBHS1 structure stabilized at an RMSD of 0.25 Å and remained equilibrated throughout the simulation. This signifies that the structure moved to some degree from the reference structure to achieve stability. Equilibration was achieved within the plateau phase of the simulation, and molecular adaptation was seen over the simulation period. The 3D form of DehLBHS1 was evaluated based on RMSD, which measures the average displacement of a set of atoms in a given frame relative to a reference structure. MD simulations are used not only for refinement but also to analyze the biological functions reflected in protein conformation, molecular shape, and structural features by observing internal motion (Śledź & Caflisch, 2018).
The quality of the model before and after refinement was evaluated with PROCHECK, and the Ramachandran statistics are given in Table 2. PROCHECK performs stereochemical checks of the model, including the distribution of the phi and psi angles of the amino acid residues. After refinement, 89.2% of the residues were in the most favored regions and 10.8% in additional allowed regions, with none in the generously allowed or disallowed regions. Overall, 100% of the amino acid residues therefore resided in favored or allowed regions, which suggests a DehLBHS1 model of acceptable quality (Figure 3). The Verify3D analysis showed that 83.5% of the amino acid residues had an average score above 0.2. Residues with a score exceeding 0.2 in Verify3D can be considered reliable, and a protein model is deemed satisfactory when its Verify3D result is over 80% (Rosdi et al., 2018). The Verify3D scores of both the pre- and post-minimisation models exceeded zero, which denotes that adequate side-chain environments were achieved (Table 2). Lastly, ERRAT, which assesses non-bonded interactions, gave an overall model quality score of 96.17%. A protein model is generally accepted as successful when the ERRAT score is > 50% (Rosdi et al., 2018). Since every evaluation exceeded its minimum cut-off score, the refined DehLBHS1 model has acceptable stereochemical consistency and a good 3D structure.
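The acceptance criteria quoted above can be collected into a small check. The sketch below is only an illustration of how the reported scores compare with the cited cut-offs; the function and key names are invented for this example, and the thresholds are the ones stated in the text (Verify3D above 80%, ERRAT above 50%).

```python
def validation_summary(ramachandran, verify3d_pct, errat_pct):
    """Compare model-quality scores with the cut-offs quoted in the text."""
    total = sum(ramachandran.values())
    return {
        "ramachandran_sums_to_100": abs(total - 100.0) < 1e-6,
        "no_disallowed_residues": ramachandran["disallowed"] == 0.0,
        "verify3d_pass": verify3d_pct > 80.0,   # more than 80% of residues score above 0.2
        "errat_pass": errat_pct > 50.0,         # overall quality factor above 50%
    }

# Post-refinement values reported for the DehLBHS1 model (percent of residues).
ramachandran = {
    "most_favoured": 89.2,
    "additional_allowed": 10.8,
    "generously_allowed": 0.0,
    "disallowed": 0.0,
}
summary = validation_summary(ramachandran, verify3d_pct=83.5, errat_pct=96.17)
for check, passed in summary.items():
    print(f"{check}: {passed}")
```

All four checks pass for the reported values, which matches the conclusion that the refined model clears every minimum cut-off.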

Molecular docking of DehLBHS1 with haloalkanoic acids
The dehalogenation process can be elucidated by examining the interaction of catalytic residues with halogenated compounds as substrates. Molecular docking helps to explain the relationship between the enzyme and the ligand molecule and, in turn, offers some insight into the degradative mechanism of the enzyme (Lemmon & Meiler, 2013). DehLBHS1 docking was performed using the AutoDock Vina software. Four haloacid ligands were docked with DehLBHS1 of Bacillus megaterium BHS1: 3-CP, 2,2-DCP, L-2CP, and D-2CP. Docking of a protein-ligand complex predicts the preferred orientation of the two molecules when they bind to form a stable protein-substrate complex (Bahaman et al., 2021; Oyewusi et al., 2021b). The lowest energy (kcal/mol) represents the highest affinity of the substrate toward the enzyme (Mishra et al., 2019). Overall, the hydrogen bond distances in the DehLBHS1-ligand complexes remained within the accepted limit for intermolecular hydrogen bonds (< 3.5 Å) (Fu et al., 2018). Short hydrogen bonds reflect a higher affinity between the enzyme and the substrate and a greater tendency to catalyse the compound. Interestingly, the DehLBHS1-3CP complex showed the lowest binding energy (−3.8 kcal/mol), yet no hydrogen bonds were formed. With each of the other ligands (2,2DCP, L-2CP, D-2CP), DehLBHS1 formed one hydrogen bond (Table 3). Table 3 lists the docking results for the DehLBHS1-ligand complexes, including hydrogen bond lengths and binding energies (kcal/mol). The docking of DehLBHS1 with 2,2-DCP, L-2CP, and D-2CP showed similar interaction levels. Docking of DehLBHS1 with 2,2-DCP gave a moderate binding energy of −2.5 kcal/mol, compared with −3.5 kcal/mol for D,L-2CP and −3.8 kcal/mol for 3CP.
It should be highlighted that the DehLBHS1-3CP complex, which has the lowest binding energy, does not form any hydrogen bond. This indicates that 3CP cannot form polar interactions with the nearby residues in this complex. However, Asp9, Tyr11, Ile44, Phe59, Asn118, Asn176, and Trp178 provide weak interactions with the substrate through hydrophobic contacts (Figure 4A). Docking of DehLBHS1 with 2,2-DCP shows a hydrogen bond between N2 of the guanidine side-chain group of Arg40 and the oxygen atom of the 2,2DCP carboxylic group (1.8 Å), as in Figure 4(B). This is the shortest hydrogen bond distance among the substrates, shorter than those in DehLBHS1-D2CP and DehLBHS1-L2CP. Additionally, 7 close-contact residues, namely Asp9, Tyr11, Phe59, Lys150, Asn118, Asn176, and Trp178, were shown to interact with 2,2-DCP. Therefore, the number of binding residues for 2,2-DCP is the highest compared to the other substrates. Correspondingly, the docking of the DehLBHS1-D2CP and DehLBHS1-L2CP complexes showed a hydrogen bond between the amino (N) group of Asn118 and the oxygen (O) atom of the D- and L-2CP carboxyl group (1.9 Å). Both complexes had substrate interactions with 6 common residues, namely Tyr11, Arg40, Phe59, Lys150, Asn176, and Trp178 (Figure 4C), and L-2CP has two additional interacting residues, Ile44 and Thr117 (Figure 4D).
Docking with the simulated and refined DehLBHS1 indicated a higher affinity between the enzyme and its substrates and a greater tendency to catalyse the compounds. Interestingly, the DehLBHS1-3CP complex showed the lowest binding energy (−4.7 kJ/mol) with two hydrogen bonds formed (Table 4), whereas the non-simulated DehLBHS1 formed no hydrogen bonds with 3CP (Table 3). The simulated DehLBHS1 formed 2-4 hydrogen bonds with the other ligands (2,2DCP, L-2CP, D-2CP) (Table 4). Generally, docking of the simulated DehLBHS1 revealed that all the substrates could form polar interactions through 2-4 hydrogen bonds with the nearby residues. Additionally, most of the residues that provided only weak hydrophobic contacts in the non-simulated DehLBHS1-ligand complexes (Asp9, Tyr11, Ile44, Phe59, Asn118, Asn176, Lys150, and Trp178) formed hydrogen bonds in the simulated and refined DehLBHS1 (Figure 5A-D). Importantly, all DehLBHS1-ligand complexes exhibited hydrogen bond lengths within the accepted cut-off distance for intermolecular hydrogen bonds (< 3.5 Å) (Fu et al., 2018). Conversely, an enzyme has a lower affinity for a substrate and is less likely to catalyze the molecule when the hydrogen bond distance between them is greater than 3.5 Å. All the hydrogen bond distances in the simulated DehLBHS1-ligand complexes fell within the acceptable range (Table 4 and Figure 5A-D).

Table 2 (excerpt). Ramachandran plot statistics from PROCHECK (% of residues).
Region                                       Pre-refinement   Post-refinement
Additional allowed regions (a, b, l, p)      11.8             10.8
Generously allowed regions (~a, ~b, ~l, ~p)  1.5              0.0
Residues in disallowed regions               0.0              0.0
Non-glycine and non-proline residues         100.0            100.0

Table 3. Docking of non-simulated DehLBHS1 with haloalkanoic acids.
Complex                   Binding energy (kcal/mol)   Hydrogen bonds   Residue   Distance
DehLBHS1-3CP complex      −3.8                        -                -         -
DehLBHS1-2,2DCP complex   −2.5                        1                Arg40     1.8 Å
DehLBHS1-D2CP complex     −3.5                        1                Asn118    1.9 Å
DehLBHS1-L2CP complex     −3.5                        1                Asn118    1.9 Å
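The distance criterion applied to the docked poses can be sketched as a simple cut-off test. The helper below is illustrative only: the 3.5 Å threshold and the distances for Arg40 and Asn118 are those reported for the non-simulated complexes, while the atom coordinates in the usage example are hypothetical.

```python
import math

HBOND_CUTOFF = 3.5  # Å, acceptable donor-acceptor distance quoted in the text

def is_hydrogen_bond(donor_xyz, acceptor_xyz, cutoff=HBOND_CUTOFF):
    """Distance-only test: True if donor and acceptor are closer than the cut-off."""
    return math.dist(donor_xyz, acceptor_xyz) < cutoff

def within_cutoff(distances, cutoff=HBOND_CUTOFF):
    """Map each labelled contact distance (Å) to True/False against the cut-off."""
    return {label: d < cutoff for label, d in distances.items()}

# Hydrogen bond lengths reported for the non-simulated complexes.
reported = {"2,2DCP-Arg40": 1.8, "D2CP-Asn118": 1.9, "L2CP-Asn118": 1.9}
print(within_cutoff(reported))

# Hypothetical coordinates (Å) for a donor nitrogen and an acceptor oxygen.
print(is_hydrogen_bond((0.0, 0.0, 0.0), (1.8, 0.0, 0.0)))
print(is_hydrogen_bond((0.0, 0.0, 0.0), (4.0, 0.0, 0.0)))
```

A distance-only test is a simplification; interaction-analysis tools typically also apply a donor-hydrogen-acceptor angle criterion.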
Thus, this study supports the preliminary experimental findings on the efficiency and specificity of DehLBHS1 of strain BHS1 in degrading 2,2-DCP (2,2-dichloropropionic acid) and on its stereoselective preference for the L-2CP substrate. We also performed multiple sequence alignments with the dehalogenase enzymes used as top threading templates by I-TASSER. Sixty-one conserved residues were identified in the alignment, and five of them were related to the binding interactions identified from the docking (Figure 6). We conclude that Asn118 and Arg40 are important residues that form hydrogen bonds with D,L-2CP and 2,2-DCP, respectively, and that three residues, namely Phe59, Asn176, and Trp178, are involved in most of the interactions. All five residues are identical to those of the related dehalogenases from Pseudomonas sp. YL and Polaromonas sp. reported in previous studies (Hisano et al., 1996; Nakamura et al., 2009; Zakary et al., 2022). 2,2-DCP and 3-CP were selected for MD simulation on the basis of the docking analysis: 2,2-DCP had the highest (least negative) binding energy with one hydrogen bond, whereas 3-CP had the lowest binding energy and formed no hydrogen bonds with the non-simulated DehLBHS1 protein. In addition, because docking is conducted without explicit water, it cannot adequately depict an enzyme-substrate interaction. Molecular dynamics simulation is more trustworthy, since it accounts for the role that water plays in the system (Anuar et al., 2020; Bahaman et al., 2021; Oyewusi et al., 2021c).
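Counting conserved positions in a multiple sequence alignment can be sketched as follows. This is a minimal illustration that treats a column as conserved only when every sequence carries the same non-gap residue; the three short sequences are toy data, not the DehLBHS1 alignment, and alignment programs may also report weaker similarity classes.

```python
def conserved_columns(aligned):
    """Return indices of alignment columns where all sequences share one residue."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    return [
        i
        for i in range(length)
        if aligned[0][i] != "-" and all(seq[i] == aligned[0][i] for seq in aligned)
    ]

# Toy alignment (gaps shown as "-"); purely illustrative.
toy_alignment = ["MDR-FK", "MDRAFK", "MERAFK"]
columns = conserved_columns(toy_alignment)
print(columns, len(columns))
```

In the toy example, columns 1 and 3 are excluded because of a substitution and a gap, respectively, leaving four fully conserved positions.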

Molecular dynamics of DehLBHS1-haloalkanoic acids
MD simulation provides comprehensive information on the dynamics and flexibility of a protein when it is bound to various ligands or substrates. Furthermore, MD simulations reveal whether the protein-ligand complex can achieve structural stability (Lee et al., 2015). This study concentrates on the molecular adjustments of DehLBHS1, in terms of stability and flexibility, in association with the degradable substrate 2,2-DCP and the non-degradable substrate 3-CP. These were assessed by RMSD, RMSF, the radius of gyration, and hydrogen bond formation.
The RMSD analysis of the protein backbone was performed to explain the conformational changes associated with the two distinct ligands. In particular, a low RMSD value (~0.2-0.3 nm) means the protein complex is in a stable state (Anuar et al., 2020; Oyewusi et al., 2020c, 2021c). In this research, the molecular dynamics (MD) trajectory was closely monitored for the degradable (2,2DCP) and non-degradable (3CP) substrates for about 100 nanoseconds (ns), as in Figure 7(a).
In comparison, the apo form of DehLBHS1 shows an RMSD value of ~0.2-0.25 nm, while the DehLBHS1-2,2DCP complex shows a lower deviation of ~0.16 nm, especially between 10 and 80 ns (Figure 7a). The DehLBHS1-3CP complex shows a somewhat higher variation of about 0.35 nm before the complex reaches a plateau at 40 ns. As the results show, the MD trajectory of the DehLBHS1-3CP complex was rather irregular and fluctuated more often than that of the DehLBHS1-2,2DCP complex.
Taken together with the docking analysis, this indicates that 3CP is a less preferred ligand for DehLBHS1, even though it has the lowest binding energy. For better clarity, the RMSD of the substrates in the active site was also calculated and clearly showed a more stable deviation pattern of about 1.1 nm for 2,2-DCP compared to 3CP (~1.3 nm), as shown in S3. At this stage, 2,2DCP appears to be the more favorable substrate to interact with DehLBHS1 compared with 3CP, based on the overall molecular motion of the substrates.
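The RMSD values discussed above can be illustrated with a minimal sketch. It assumes that each frame has already been superimposed on the reference structure and that coordinates are held in NumPy arrays; the function names `rmsd` and `rmsd_series` are hypothetical and are not taken from the software used in this study.

```python
import numpy as np

def rmsd(coords, ref):
    """RMSD between two (N, 3) coordinate arrays, in the units of the input.

    Assumes both structures are already superimposed (no fitting is done here).
    """
    diff = np.asarray(coords, dtype=float) - np.asarray(ref, dtype=float)
    # Mean over atoms of the squared displacement, then square root.
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

def rmsd_series(traj, ref):
    """RMSD of every frame of a (T, N, 3) trajectory against ref (N, 3)."""
    diff = np.asarray(traj, dtype=float) - np.asarray(ref, dtype=float)
    return np.sqrt((diff ** 2).sum(axis=2).mean(axis=1))
```

Applied to backbone atoms, `rmsd_series` would give one value per frame, which is the kind of curve plotted against simulation time in an RMSD figure.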
It is necessary to look at the local flexibility and the residues contributing to the protein dynamics. Root mean square fluctuation (RMSF) gauges the flexibility of specific residues, that is, the degree to which individual residues move (fluctuate) during a simulation. Notably, the level of movement calculated by RMSF reflects the structural stability of the protein (Bahaman et al., 2021; Kumar et al., 2014). Structural fluctuation of the protein backbone and the side chains can result in conformational adjustments. Likewise, such changes could affect the preferred structural constraints of the substrates at the active site (Flannelly et al., 2015). In our findings, the RMSF patterns in Figure 8(a) for the DehLBHS1-2,2DCP and DehLBHS1-3CP complexes had similar values and indicated reasonable stability throughout the 100 ns. The N-terminus has the highest fluctuation, indicating a highly flexible region. The RMSF plot contains five regions with peaks higher than 0.15 nm, probably related to the greater flexibility of the loop structures (Ruvinsky et al., 2012). Interestingly, the residues in the core domain have low RMSF values compared to the surface with exposed loops (Fuentes et al., 2018).
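As a minimal sketch of how a per-atom RMSF can be obtained from a trajectory, the following assumes the frames have already been fitted to a common reference and are stored as a NumPy array; the function name `rmsf` is hypothetical.

```python
import numpy as np

def rmsf(traj):
    """Per-atom RMSF for a (T, N, 3) trajectory, in the units of the input.

    Assumes the frames are already fitted to a common reference, so that
    only internal motion contributes to the fluctuation.
    """
    traj = np.asarray(traj, dtype=float)
    mean_pos = traj.mean(axis=0)                    # (N, 3) time-averaged position
    sq_dev = ((traj - mean_pos) ** 2).sum(axis=2)   # (T, N) squared deviations
    return np.sqrt(sq_dev.mean(axis=0))             # (N,) one value per atom
```

Unlike RMSD, which yields one value per frame, this yields one value per atom (or per residue if one atom per residue is chosen), which is why RMSF is plotted against residue number.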
It is also observed that the DehLBHS1-2,2DCP complex has a low RMSD value, which in turn indicates that the ligand was tightly bound to the enzyme. Thus, the MD simulations captured the stability of the DehLBHS1 complexes with the bound halogenated organic ligands. Conversely, the higher the RMSD value, the weaker the binding of the substrate to DehLBHS1, as seen in the DehLBHS1-3CP complex.
The active site RMSF is just as important as the RMSF plot calculated for the DehLBHS1 complexes with 2,2DCP and 3CP. Therefore, the atomic fluctuations of the substrates 2,2DCP and 3CP alone were also calculated. These findings show that the chlorine (Cl) fluctuation for 3CP (0.5 Å) is greater than that for 2,2DCP, as shown in Figure 8(b). This may be because 2,2DCP has two chlorine atoms bound to the alpha-carbon, whereas 3CP has one chlorine atom at the beta-carbon, distant from the carboxyl group, which makes it more flexible. 2,2DCP has lower RMSF values than 3CP, indicating a more stable conformation in the vicinity of the binding site, even though both substrates show a minimum distance from the active site residues. The chlorine in 2,2-DCP is well oriented and could readily be cleaved by a water molecule in a hydrolytic mechanism.
This study also tracked the radius of gyration (Rg) of the DehLBHS1-substrate complexes to assess the degree of compaction and to monitor the overall dimensions of the structures during the MD simulation. Rg is the mass-weighted root mean square distance of the atoms from their common center of mass. A relatively constant Rg value represents a stably folded structure, whereas unfolding causes the Rg value to shift during the simulation (Liao et al., 2014). The highest peaks of an Rg plot arise when the amino acids are packed more loosely, whereas the lowest values are due to tighter packing (Lobanov et al., 2008).
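The mass-weighted definition of Rg can be sketched as follows for a single frame. This is an illustrative sketch only; the function name `radius_of_gyration` and the array layout are assumptions, not the implementation used in this study.

```python
import numpy as np

def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration of one frame.

    coords: (N, 3) atomic positions; masses: (N,) atomic masses.
    The result is in the units of coords.
    """
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    total_mass = masses.sum()
    # Center of mass of the selected atoms.
    com = (masses[:, None] * coords).sum(axis=0) / total_mass
    # Squared distance of each atom from the center of mass.
    sq_dist = ((coords - com) ** 2).sum(axis=1)
    return float(np.sqrt((masses * sq_dist).sum() / total_mass))
```

When only one atom type is selected (for example C-alpha atoms), all masses are equal and the expression reduces to the plain root mean square distance from the centroid.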
The Rg of the Cα atoms over the 100 ns simulation at 300 K is plotted for the DehLBHS1-2,2DCP and DehLBHS1-3CP complexes and for DehLBHS1 in apo form. The Rg values of DehLBHS1-2,2DCP, DehLBHS1-3CP, and DehLBHS1-apo range from 1.66 to 1.77 nm, as in Figure 9. Compared to the apo form, the DehLBHS1-2,2DCP complex is more compact at around ~1.68 nm, and DehLBHS1-3CP is less compact at around ~1.75 nm (Figure 9), accommodating the substrates by an induced-fit conformation. In the bound state, DehLBHS1-2,2DCP is therefore packed more tightly than DehLBHS1-3CP according to Rg. The difference in values can be observed between the first and last 20 ns. Here, 3CP has a higher Rg value than 2,2DCP and the apo form, indicating the adaptability of the active site in orienting the substrate in the functional domain.
Secondary, tertiary, and quaternary structures of proteins are formed by hydrogen bonds, which are thought to be the primary constituents of biomolecular structures. The loss of hydrogen bonding can obstruct proper folding, which can severely affect structural integrity. Hydrogen bonds are essential in molecular recognition as well as in the overall stability of the protein structure (Baig et al., 2014). In an aqueous environment, hydrogen bonds are essential for protein-ligand binding, particularly when hydrolysis is necessary to complete the reaction. In this study, the number of substrate-protein hydrogen bonds for 2,2DCP-DehLBHS1 was more consistent, with hydrogen bonds formed steadily at about 15, 50, and 80 ns, whereas for 3CP hydrogen bonds were formed continuously throughout the 100 ns (Figure 10a,b). The short distances between the substrate and DehLBHS1 supported the hydrogen bond formation. Notably, 2,2DCP and 3CP had similar molecular distances in the range of 1.5-3.0 Å and interacted closely with the enzyme over 100 ns (S4).
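Counting hydrogen bonds per frame requires a geometric criterion. The sketch below assumes a commonly used one: a donor-acceptor distance of at most 0.35 nm and a hydrogen-donor-acceptor angle of at most 30 degrees. The criterion actually applied in this study is not stated, so both thresholds, and the function name `is_hydrogen_bond`, are assumptions made for illustration.

```python
import numpy as np

def is_hydrogen_bond(donor, hydrogen, acceptor, max_dist=0.35, max_angle=30.0):
    """Geometric hydrogen bond test for one donor-H...acceptor triplet.

    Positions are (3,) vectors in nm. The thresholds are assumed defaults,
    not values reported in the paper. Atoms must not coincide.
    """
    donor = np.asarray(donor, dtype=float)
    da = np.asarray(acceptor, dtype=float) - donor   # donor -> acceptor
    dh = np.asarray(hydrogen, dtype=float) - donor   # donor -> hydrogen
    dist = np.linalg.norm(da)
    if dist > max_dist:
        return False
    # Angle at the donor between the D-H and D-A vectors.
    cos_angle = np.dot(da, dh) / (dist * np.linalg.norm(dh))
    angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return bool(angle <= max_angle)
```

Summing this test over all substrate-protein donor/acceptor pairs in each frame would give a hydrogen bond count as a function of time.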

MM-PBSA binding free energy calculation
The binding strength between DehLBHS1 and the tested substrates (2,2-DCP and 3CP) was estimated from the van der Waals, electrostatic, polar solvation, and nonpolar solvation energies and the resulting binding free energy, through MM-PBSA calculation on the MD trajectories (Kumar et al., 2014). Compared to molecular docking, where the system is more rigid since no water molecules are present, estimation using MM-PBSA provides a more accurate prediction for an enzyme-substrate complex. In docking, the stability and flexibility of protein-ligand binding are restricted to specific motions, so docking is not a true reflection of how an enzyme and a substrate interact in nature. Additionally, because the protein-ligand complex is sampled over a 100 ns production simulation, the energy derived from the MD trajectories is more sensitive and adaptable. In this research, the DehLBHS1-2,2-DCP complex had the best binding energy (ΔG_binding) (−21.363 ± 1.26 kcal/mol), whereas DehLBHS1-3CP had the least favorable binding energy (−14.327 ± 1.738 kcal/mol) (Table 5). As can be seen, the MM-PBSA calculation supported the results of the MD simulation and the molecular docking analysis, in which the simulated DehLBHS1 model bound 2,2-DCP with the most favorable binding energy and four hydrogen bonds, while 3CP showed the weakest binding with no hydrogen bond formation.
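For orientation, the energy terms named above are usually combined as in the following general MM-PBSA scheme. This is the standard textbook form, given here as a sketch; whether an entropy term was included in this study is not stated, and it is often omitted when only relative binding energies are compared.

```latex
\begin{align}
\Delta G_{\text{binding}} &= G_{\text{complex}} - \left( G_{\text{protein}} + G_{\text{ligand}} \right) \\
G_{x} &= \langle E_{\text{MM}} \rangle + \langle G_{\text{solv}} \rangle - TS \\
E_{\text{MM}} &= E_{\text{bonded}} + E_{\text{vdW}} + E_{\text{elec}} \\
G_{\text{solv}} &= G_{\text{polar}} + G_{\text{nonpolar}}
\end{align}
```

Here $x$ stands for the complex, the protein, or the ligand, and the angle brackets denote averages over snapshots taken from the MD trajectory. A more negative $\Delta G_{\text{binding}}$ corresponds to more favorable binding.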
These data generally demonstrated the ability of DehLBHS1 to degrade both substrates, although it favored 2,2-DCP over 3CP. Furthermore, because the calculations are based on the best MD simulation trajectories of all enzyme-substrate complexes, MM-PBSA estimations are more precise than those of AutoDock Vina (Wang et al., 2018). The calculations also consider the role played by water molecules in the dehalogenation of the substrates by DehLBHS1. Overall, DehLBHS1 preferred to degrade 2,2-DCP rather than 3CP. Notably, the MM-PBSA results highlighted the significant substrate specificity of DehLBHS1.

Conclusion
This study presents the first structural model of DehLBHS1, an L-haloacid dehalogenase isolated from the halotolerant bacterium Bacillus megaterium BHS1. The model reveals key amino acid residues, including Arg40, Asn118, Phe59, Asn176, and Trp178, that play critical roles in the binding and stabilization of substrates, and are conserved across other dehalogenases. The model also shows that DehLBHS1 shares a similar architecture with distant dehalogenases. The molecular docking simulations and MM-PBSA calculations demonstrate that DehLBHS1 has a preference for 2,2DCP and D,L-2CP, which form stable complexes with the enzyme through intermolecular bonds and low binding energy. Molecular dynamics simulations further confirm the preference for 2,2DCP based on various metrics. Ligand binding alters the conformation, stability, and flexibility of an enzyme, and the design of an ideal ligand can expand DehLBHS1's substrate specificity and catalytic activity. Rational design of the enzyme, including the selection of an ideal ligand, will be crucial for expanding its potential in bioremediation. Overall, this study provides crucial insights into the structural features and enzymatic mechanisms of DehLBHS1, and may aid in the development of more efficient biocatalytic strategies for the degradation of halogenated pollutants.

Table 1.

Figure 1. Structures of halogenated compounds assessed by this study.

Figure 3. The Ramachandran plot of the polypeptide backbone torsion angles psi against phi of the amino acids in the DehLBHS1 structures (generated on https://saves.mbi.ucla.edu/). (a) Pre- and (b) post-minimisation.

Table 2. Model evaluation of DehLBHS1 before and after refinement using different tools.

Table 3. Binding energy and hydrogen bonds of DehLBHS1 docked complexes.

Table 4. Binding energy and hydrogen bonds of simulated and refined DehLBHS1 docked complexes.

Table 5. Binding free energies from MM-PBSA in kcal/mol for DehLBHS1-substrates complexes.