Proteins engage in highly selective interactions in living systems, from the micro to the macro scale. A variation (mutation) in the amino acid sequence can significantly perturb or completely abolish function, potentially leading to disease. It is therefore important to understand the impact of sequence variation on protein structure. The stability of proteins plays an important role in characterizing their function, activity, and regulation [1].
One way to assess the effect of a mutation on protein binding affinity or stability is to measure it experimentally. However, experimental methods can be time-consuming and costly. With advances in computing and its integration with chemistry, physics, and biology, it has become practical to estimate the impact of mutations on protein stability and energy computationally, with results that approach experimental accuracy [2].
The current era of genome sequencing has uncovered a large number of human genetic variations, many of which may affect protein binding and function [3].
Protein stability refers to the ability of a protein to maintain its native three-dimensional structure under a given set of conditions. The Gibbs free energy (ΔG) is a thermodynamic parameter that describes the tendency of a system to change spontaneously from one state to another. In the context of protein stability, ΔG is a measure of the free energy difference between the folded (native) and unfolded (denatured) states of the protein [4].
With ΔG taken as the free energy of the folded state minus that of the unfolded state, a negative ΔG value indicates that the protein is stable in its folded state, while a positive ΔG value indicates that the protein is unstable and tends to unfold. The magnitude of ΔG reflects the strength of the interactions that stabilize the folded protein, such as hydrogen bonds, hydrophobic interactions, and electrostatic interactions [5].
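The sign convention above can be written out explicitly. The following is a sketch that assumes a simple two-state model (only a native state N and an unfolded state U are populated); the symbols $K_{\mathrm{fold}}$, $f_{\mathrm{N}}$, $R$, and $T$ are introduced here for illustration and do not appear in the text above.

```latex
\begin{align}
\Delta G_{\mathrm{fold}} &= G_{\mathrm{folded}} - G_{\mathrm{unfolded}} \\
\Delta G_{\mathrm{fold}} &= -RT \ln K_{\mathrm{fold}}, \qquad K_{\mathrm{fold}} = \frac{[\mathrm{N}]}{[\mathrm{U}]} \\
f_{\mathrm{N}} &= \frac{K_{\mathrm{fold}}}{1 + K_{\mathrm{fold}}}
\end{align}
```

Under this assumption, a negative $\Delta G_{\mathrm{fold}}$ corresponds to $K_{\mathrm{fold}} > 1$, so more than half of the molecules are folded at equilibrium, and $\Delta G_{\mathrm{fold}} = 0$ corresponds to equal populations of the two states.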
Experimental techniques such as protein folding assays, circular dichroism spectroscopy, and differential scanning calorimetry can be used to measure protein stability and ΔG values under various conditions, such as changes in temperature, pH, and ionic strength. Computational methods such as molecular dynamics simulations and free energy calculations can also be used to predict protein stability and ΔG values based on the protein's structure and environmental conditions [6].
The AIDS pandemic, caused by the retrovirus HIV-1, has claimed more than 30 million lives over the past four decades. Antiretroviral therapy (ART), which must be taken for life, has made the disease more manageable. The CD4+ T lymphocyte is the main target cell of HIV-1, which enters by binding to the receptor CD4 and to a co-receptor such as CC-chemokine receptor 5 (CCR5). This binding triggers fusion of the viral and human cell membranes, which initiates a complex intracellular life cycle that produces new viruses [7].
Computational chemistry is a multidisciplinary field that combines principles of chemistry, physics, and computer science to investigate and understand chemical phenomena. It involves the development and application of theoretical models, algorithms, and software tools, including computer simulations and mathematical models, to estimate the structures, properties, reactivity, and behaviour of molecules, materials, and chemical reactions.
The use of computational methods in chemistry has revolutionized the way researchers approach the study of molecules and materials. It enables the exploration of complex chemical systems that are often difficult or even impossible to study experimentally. Computational chemistry techniques provide insights into molecular interactions, reaction mechanisms, and properties of compounds, helping researchers to design new drugs, catalysts, and materials.
Computational chemistry has many applications, including drug discovery, materials science, catalysis, and environmental chemistry. Computational methods can predict the properties of molecules and materials with reasonable accuracy, reducing the need for expensive and time-consuming experiments. This saves time and enables faster, more efficient development of new drugs, materials, and technologies.
Computational chemistry encompasses a wide range of methods and techniques for studying chemical systems. In addition to molecular dynamics (MD) simulations and protein modelling, it includes techniques such as quantum chemistry, molecular mechanics, and molecular docking, among others [8].
Commonly used methods in computational chemistry and computer-aided drug design (CADD) include molecular mechanics, quantum mechanics, density functional theory, and molecular dynamics simulations. These methods vary in accuracy and computational cost, and are chosen based on the specific research question and the available computational resources.
Overall, computational chemistry plays an important role in advancing our understanding of chemical systems and developing new technologies that can improve our lives.
Computer-aided drug design (CADD) is a computational approach that uses algorithms and software to assist the drug discovery process. It applies various computational tools to identify potential drug candidates and optimize their properties before they are tested in the laboratory [9].
CADD has become an essential tool in drug discovery, allowing researchers to rapidly screen large numbers of compounds and optimize their properties before investing time and resources in expensive experimental studies.
Virtual screening is a computational technique used to predict the potential activity of small molecules (ligands) against a specific target protein. It uses software to analyse large databases of molecules and predict their affinity and activity for the target, identifying candidates that can bind to the target protein and modulate its activity [10]. By prioritizing compounds with high predicted affinity and specificity, virtual screening can significantly reduce the time and cost of drug discovery.
Molecular dynamics (MD) simulation is a computational technique used to study the behaviour of atoms and molecules over time [11]. MD simulations can be applied to a wide range of chemical and biochemical systems, including proteins, DNA, and small molecules. They can provide insights into the dynamics and thermodynamics of these systems, such as the conformational changes that occur in proteins and the binding of ligands to enzymes. In an MD simulation, the system of interest is described by a set of equations of motion that define the behaviour of each atom or molecule. The equations of motion take into account the interactions between atoms or molecules, which are described by a potential energy function. The simulation proceeds by solving the equations of motion numerically, typically with an integration method such as the Verlet algorithm or the leapfrog algorithm [12]. At each time step, the simulation calculates the position, velocity, and acceleration of each atom or molecule, and the positions are updated based on these calculations.
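The numerical integration described above can be illustrated with a minimal sketch. It uses the velocity Verlet variant of the Verlet algorithm on a single particle in one dimension attached to a harmonic spring. The system, the function names, and the parameter values are assumptions chosen for illustration; production MD codes handle many atoms, realistic force fields, and thermostats.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate one particle in 1D with the velocity Verlet scheme.

    Returns a list of (position, velocity) pairs, one per time step,
    including the initial state.
    """
    trajectory = [(x, v)]
    a = force(x) / mass
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # update position from current v and a
        a_new = force(x) / mass              # acceleration at the new position
        v = v + 0.5 * (a + a_new) * dt       # update velocity with the averaged acceleration
        a = a_new
        trajectory.append((x, v))
    return trajectory


# Illustrative system (an assumption, not from the text): harmonic spring, F = -k x
K_SPRING = 1.0
MASS = 1.0


def spring_force(x):
    return -K_SPRING * x


def total_energy(x, v):
    """Kinetic plus potential energy of the spring system."""
    return 0.5 * MASS * v * v + 0.5 * K_SPRING * x * x


traj = velocity_verlet(x=1.0, v=0.0, force=spring_force,
                       mass=MASS, dt=0.01, n_steps=1000)
e_start = total_energy(*traj[0])
e_end = total_energy(*traj[-1])
print(f"energy drift after {len(traj) - 1} steps: {abs(e_end - e_start):.2e}")
```

The total energy is used here as a simple sanity check: with a small enough time step, a Verlet-type integrator should keep it close to its initial value over the run.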
In the context of protein modelling, MD simulations are used to study the structural and dynamic properties of proteins, including their folding and unfolding processes, interactions with ligands, and conformational changes [13].
Protein modelling is the process of predicting the three-dimensional structure of a protein from its amino acid sequence. The three-dimensional structure of a protein is essential to understanding its function, interactions, and biochemical properties. There are several methods used to model protein structures, including homology modelling, ab initio modelling, and molecular dynamics simulations.
Homology modelling rests on the assumption that proteins with similar amino acid sequences adopt similar three-dimensional structures [14]. The known structure of a related protein is used as a template to predict the structure of the target protein. The accuracy of the resulting model depends on the similarity between the amino acid sequences of the target and the template.
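Sequence similarity between target and template is often summarized as percent identity over an alignment. The sketch below assumes the two sequences have already been aligned (equal length, with `-` marking gaps) and counts identity over columns where neither sequence has a gap. Other denominators are also in use, so this is one possible definition rather than a standard one. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical pre-aligned fragments, for illustration only
target = "MKT-AYIAKQR"
template = "MKTLAYLAKQ-"
print(f"identity: {percent_identity(target, template):.1f}%")
```

A template with higher identity to the target would generally be preferred, in line with the dependence on sequence similarity noted above.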
Ab initio modelling, also known as de novo modelling, is a method that predicts the structure of a protein without using a template structure. Ab initio modelling is based on physical principles such as energy minimization and can be computationally expensive. This method is more challenging than homology modelling but can be used for proteins that do not have a close homolog with a known structure. [15]
Protein modelling is an essential tool for understanding protein function and structure. It has applications in drug design, protein engineering, and understanding the mechanisms of protein-protein interactions. A protein may have multiple structures available, and structure-based methods may predict a different change in stability depending on which structure is used. A mutation can change the stability of a protein.
The DUET online server is used for these computations. DUET consolidates two complementary approaches (mCSM and SDM) into a consensus prediction, obtained by combining the results of the separate methods in an optimized predictor using Support Vector Machines (SVM) [16]. The method improves the overall accuracy of the predictions in comparison with either method individually and performs as well as or better than similar methods. DUET is a bioinformatics web server created for gaining insight into the effects of non-synonymous single nucleotide polymorphisms (nsSNPs) on protein stability. It integrates the two methods to leverage the best of each: SDM, a statistical potential energy function that relies on substitution tables derived from homologous protein families and so incorporates constraints on residue environments during evolution, and mCSM, a machine learning algorithm that takes into account the 3D physicochemical environment of the residue, summarized as a graph-based structural signature [16].
Mutations can be classified into three categories: (a) “Good”, which increase fitness; (b) “Indifferent” or “Neutral”, whose effects are too small to matter; and (c) “Bad”, which decrease fitness [16].
ΔΔG results fall into three categories:
- ΔΔG > 0.5: Positive values suggest that the mutation is destabilizing. Such substitutions are usually avoided during design and can be classified as “Bad”.
- 0.5 > ΔΔG > -0.5: Values near 0 are within the noise range and should be considered indifferent or neutral. Such substitutions can be included in a design to allow neutral changes that may compensate for other changes in the protein. They can be classified as “Neutral” or “Indifferent”.
- ΔΔG < -0.5: Negative values suggest that the mutation leads to a more stable protein and can be classified as “Good”.
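The three categories above can be expressed as a small helper. This is a sketch under two assumptions: the sign convention is the one stated in the list (positive ΔΔG is destabilizing), and values exactly equal to the thresholds, which the list does not cover, are treated as neutral. Some prediction tools report ΔΔG with the opposite sign, so the convention of a given tool's output should be checked before applying these thresholds. The function name and the example mutations and values are hypothetical.

```python
def classify_ddg(ddg, threshold=0.5):
    """Map a ΔΔG value to "Bad", "Neutral", or "Good".

    Assumes positive ΔΔG means destabilizing, as in the list above.
    Values exactly at +threshold or -threshold are treated as "Neutral"
    (an assumption; the list does not specify the boundary cases).
    """
    if ddg > threshold:
        return "Bad"       # destabilizing
    if ddg < -threshold:
        return "Good"      # stabilizing
    return "Neutral"       # within the noise range


# Hypothetical mutations and ΔΔG values, for illustration only
example_results = {"A10G": 1.8, "L25V": 0.2, "K40R": -0.9}
for mutation, ddg in example_results.items():
    print(mutation, ddg, classify_ddg(ddg))
```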
Protein modelling of missense mutations involves predicting the structural and functional consequences of amino acid substitutions that alter the protein sequence. Missense mutations are single-nucleotide variations that change a single amino acid residue in a protein sequence, potentially affecting protein stability, interactions, or enzymatic activity.
There are several computational tools and methods available for protein modelling of missense mutations, including homology modelling, molecular dynamics simulations, and machine learning-based approaches. These methods use various algorithms to predict the effect of a missense mutation on protein structure and function, such as changes in protein stability, folding, dynamics, and interactions. [17-21]
One common approach is to compare the predicted structure and stability of the wild-type protein with that of the mutated protein. If the mutation destabilizes the protein or alters its structural integrity, it may affect the protein's function or interactions with other molecules.
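The comparison of wild-type and mutant stability can be written as a difference of folding free energies. The sketch below assumes that ΔG is the free energy of the folded state minus that of the unfolded state, so that a stable protein has a negative ΔG; the subscript and superscript labels are introduced here for illustration.

```latex
\begin{equation}
\Delta\Delta G = \Delta G_{\mathrm{fold}}^{\mathrm{mutant}} - \Delta G_{\mathrm{fold}}^{\mathrm{wild\text{-}type}}
\end{equation}
```

Under this convention, a positive $\Delta\Delta G$ means the mutant has a less favourable folding free energy than the wild type and is therefore destabilized, while a negative value means it is stabilized. Some tools define the difference with the opposite sign, so the convention should be confirmed for each method before values are compared.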
Overall, protein modelling of missense mutations can provide valuable insights into the potential effects of genetic variations on protein structure and function, which can help in understanding the molecular basis of genetic diseases and designing therapeutic interventions.