SNP retrieval
The SNPs were retrieved from the SNP database (dbSNP) of the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/snp), with the search limited to Homo sapiens and missense variants [32]. An overview of the methodology is shown in Fig. 2.
Predictions of Deleterious Effects
In silico tools were employed to distinguish deleterious SNPs among those retrieved. SIFT (Sorting Intolerant From Tolerant) uses sequence homology to predict whether an amino acid substitution will alter protein function [17]. If the normalised score is below the cut-off (0.05 in our case), the substitution is predicted to be deleterious. Polymorphism Phenotyping v2 (PolyPhen-2) predicts the possible impact of an amino acid substitution on the structure and function of a protein. Mutations are categorised as ‘probably damaging’, ‘possibly damaging’, or ‘benign’ based on false-positive rate (FPR) thresholds [33]. Single-site mutations can also change protein stability, and this change can be predicted with I-Mutant [34]. Its output is the difference in Gibbs free energy change (ΔΔG/DDG, in kcal/mol) between the mutant and native protein. Protein ANalysis THrough Evolutionary Relationships (PANTHER) classifies proteins according to family, subfamily, molecular function, etc. [35]. In addition to these four tools, we used Functional Analysis through Hidden Markov Models v2.3 (FATHMM), a high-throughput web server that predicts the functional consequences of both coding variants, i.e., non-synonymous single nucleotide variants (ns-SNVs), and non-coding variants in the human genome, and classifies substitutions as tolerated or damaging [36].
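As a rough illustration of the SIFT threshold described above, the following Python sketch flags a substitution when its score falls below 0.05 and counts how many tools call a variant deleterious. The vote-counting step and all function names are assumptions made for illustration; the text does not state how the outputs of the individual tools were combined.

```python
SIFT_CUTOFF = 0.05  # cut-off quoted in the text


def sift_is_deleterious(score, cutoff=SIFT_CUTOFF):
    """Return True when a SIFT score is strictly below the cut-off."""
    return score < cutoff


def deleterious_votes(sift_score, other_calls):
    """Count the tools that flag a variant as deleterious.

    other_calls maps a tool name to True (flagged) or False (not flagged).
    Combining tools by a simple vote is an assumption of this sketch.
    """
    votes = 1 if sift_is_deleterious(sift_score) else 0
    votes += sum(1 for flagged in other_calls.values() if flagged)
    return votes
```

A variant could then be shortlisted when its vote count reaches a chosen minimum, for example when every tool agrees.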
Structure modelling and root mean square deviation (RMSD) calculations
Mutations can cause significant changes in the structure and stability of proteins. Three-dimensional structures of the native and mutant proteins were compared to evaluate these changes. SWISS-MODEL, an automated homology modelling server, was used to model the 3D structure of the protein. PROCHECK was used to validate the 3D structures, after which the mutants were generated with Swiss-PdbViewer [37]. The energy of both the native and mutant structures was minimised using NOMAD-Ref, which uses GROMACS by default for minimisation [38]. Finally, the differences in total energy were computed from the minimised structures, and Swiss-PdbViewer was used to compute the root mean square deviation (RMSD) between the native and mutant structures.
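For reference, the RMSD between two structures is the square root of the mean squared distance between corresponding atoms. A minimal Python sketch follows, assuming the two coordinate sets have already been superposed and list the same atoms in the same order. The function name is hypothetical, and this is not the routine used by the structure viewer.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superposed and that atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

The result is in the same length unit as the input coordinates (Å for PDB files).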
Trajectory Analysis
Stabilising residues were identified with SRide, which defines them based on their long-range interactions, hydrophobicity, and amino acid residue conservation [39]. Further investigation was done using PSIPRED, a simple and accurate secondary structure prediction method that analyses the output of PSI-BLAST with two feed-forward neural networks [40]. The functional regions of the protein were identified using ConSurf [41]. In addition, we used Missense3D, which predicts the structural changes introduced by an amino acid substitution and can analyse both PDB coordinates and homology-predicted structures [42].
ADMET Analysis
The SwissADME web tool of the Swiss Institute of Bioinformatics and the pkCSM server were used to estimate the individual ADME behaviour of the phytocompounds resveratrol, capsaicin, and rosmarinic acid [43, 44]. The toxicity of the phytocompounds was evaluated using ProTox-II, which predicts toxicity endpoints including hepatotoxicity, acute toxicity, carcinogenicity, mutagenicity, cytotoxicity, and immunotoxicity. The prediction is based on the similarity between the functional groups of the query molecule and those of compounds in the software's database.
Molecular Docking
FTO’s crystal structure (PDB ID: 4QHO) was retrieved from the RCSB Protein Data Bank; all five SNPs were modelled, and the energy was minimised using Swiss-PdbViewer [45]. The drug molecules capsaicin, resveratrol, and rosmarinic acid were obtained from PubChem (Compound CIDs 1548943, 445154, and 5281792, respectively) in SDF format and converted to PDBQT format after the addition of hydrogens using Open Babel. The drug molecules were used as ligands, and the energy-minimised native FTO structure and the five mutants, each harbouring a single SNP, were used as receptors for the docking studies. Protein preparation for docking was done using AutoDock Tools (ADT, version 1.5.6). The PDB structure was converted to PDBQT format after removing crystal water molecules and native ligands, assigning Kollman charges, and adding polar hydrogens. The ligand charges were merged, and non-polar hydrogens were removed in ADT. The search space was defined according to the binding site of FTO, which is formed by the residues Arg95, Tyr108, Asn205, His231, Asp233, Tyr295, His307, Thr320, and Arg322 [46]. Docking was performed using AutoDock Vina (version 1.1.2) [47] with exhaustiveness. The interactions of the drug molecules with the active site were analysed using Schrödinger Maestro. For each complex, the binding pose with the best binding affinity and the most interactions with the active-site residues was selected for MD simulations.
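One way to turn a list of binding-site residues into a docking search space is to take the bounding box of their atoms and add a margin. The Python sketch below does this from PDB ATOM records and writes the result in AutoDock Vina's configuration-file syntax. The chain identifier "A", the 5 Å padding, and both function names are assumptions for illustration, not values reported here. The exhaustiveness line is written only if the caller supplies a value.

```python
# Residue numbers of the FTO binding site as listed in the text.
BINDING_SITE = {95, 108, 205, 231, 233, 295, 307, 320, 322}


def binding_site_box(pdb_lines, residue_numbers, chain="A", padding=5.0):
    """Return (center, size) of a box enclosing the chosen residues.

    chain and padding are illustrative defaults, not values from the study.
    """
    coords = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        if line[21] != chain or int(line[22:26]) not in residue_numbers:
            continue
        # Fixed-width PDB columns for the x, y and z coordinates.
        coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    if not coords:
        raise ValueError("no atoms found for the requested residues")
    mins = [min(c[i] for c in coords) for i in range(3)]
    maxs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    return center, size


def vina_config(receptor, ligand, center, size, exhaustiveness=None):
    """Format a Vina configuration file as a string."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value:.3f}")
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value:.3f}")
    if exhaustiveness is not None:
        lines.append(f"exhaustiveness = {exhaustiveness}")
    return "\n".join(lines)
```

The returned string can be saved to a file and passed to Vina through its `--config` option.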
Molecular Dynamics Simulations
The docked complexes of the five FTO mutants and the native protein with the three selected drug molecules (capsaicin, resveratrol, and rosmarinic acid) were subjected to molecular dynamics simulation using the GROMACS software suite (version 2018.7), following previously described methods [48]. The GROMOS 54a7 force field was used to build the protein topology, while ligand topologies were generated using the PRODRG server [49]. A cubic box extending 10 Å from the protein atoms in all directions was generated and solvated with the SPC three-point water model. The system was neutralised by adding an appropriate number of sodium ions. The solvated, electro-neutral system was energy minimised to eliminate steric clashes using the steepest descent algorithm with a convergence criterion of 1000 kJ/mol/nm. A cut-off of 1.2 nm was used for computing both van der Waals and Coulombic interactions, while long-range electrostatic interactions were handled using the particle-mesh Ewald (PME) method. The complex was first equilibrated in the canonical ensemble (NVT) at 300 K for 100 ps, with solvent and ions restrained. This was followed by NPT equilibration for 100 ps, during which the restraint weight on the protein-ligand complex was gradually reduced. The LINCS algorithm was used to constrain bonds, while constant temperature and pressure were maintained using the Berendsen thermostat and Parrinello-Rahman pressure coupling, respectively [50, 51]. A production simulation was performed for 100 ns with a time step of 2 fs using the leap-frog integrator. Post-simulation analysis was performed using GROMACS modules and in-house Python scripts.
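The production-run settings quoted above can be collected into a GROMACS parameter (.mdp) fragment. The Python sketch below derives the step count from the run length and time step and prints the corresponding options. It is an incomplete fragment under stated assumptions: the function names are hypothetical, and a usable .mdp file needs further entries (for example coupling groups and time constants) that are not reported here.

```python
def n_steps(duration_ps, dt_ps=0.002):
    """Number of integration steps for a run of duration_ps picoseconds."""
    return round(duration_ps / dt_ps)


def production_mdp(duration_ns=100, dt_ps=0.002):
    """Return an .mdp fragment with the settings stated in the text."""
    params = {
        "integrator": "md",                # leap-frog integrator
        "dt": dt_ps,                       # 2 fs time step, in ps
        "nsteps": n_steps(duration_ns * 1000, dt_ps),
        "coulombtype": "PME",              # long-range electrostatics
        "rcoulomb": 1.2,                   # Coulomb cut-off, nm
        "rvdw": 1.2,                       # van der Waals cut-off, nm
        "constraint_algorithm": "lincs",   # bond constraints
        "tcoupl": "berendsen",             # thermostat
        "ref_t": 300,                      # reference temperature, K
        "pcoupl": "Parrinello-Rahman",     # barostat
    }
    return "\n".join(f"{key} = {value}" for key, value in params.items())


if __name__ == "__main__":
    print(production_mdp())
```

With a 2 fs step, the 100 ps equilibration stages correspond to 50,000 steps each and the 100 ns production run to 50,000,000 steps.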