3.1. Missense nsSNPs datasets
For the biological consequences study of missense nsSNPs, we retrieved 451 missense nsSNPs with 386 rsIDs in the human GABRA6 gene mapped to NM_000811.3 RefSeq (S1 Table) from the dbSNP database. And the protein sequence of the GABRA6 gene (with Q16445 ID) was retrieved from the UniProt database.
3.2. Prediction of most deleterious missense nsSNPs
Four in silico tools including SIFT, PolyPhen-2, PROVEAN, and Condel were used to predict deleterious missense nsSNPs in this study. Out of 451 missense nsSNPs, 303 were predicted as deleterious and 148 as tolerated by SIFT, 306 were predicted as deleterious and 145 as neutral by HumDiv PolyPhen-2 (143 benign, 49 possibly damaging, and 259 probably damaging), 290 were predicted as deleterious and 161 as neutral by HumVar PolyPhen-2 (157 benign, 55 possibly damaging, and 239 probably damaging), 225 were predicted as deleterious and 226 as neutral by PROVEAN, and 298 were predicted as deleterious and 153 as neutral by Condel (S2 Table and Fig. 2).
Figure 2. Distribution of predicted deleterious and neutral missense nsSNPs by 4 tools for human GABRA6 gene. Deleterious nsSNPs shown in dark blue and neutral nsSNPs in light blue.
Deleterious missense nsSNPs were predicted with default scores for each tool. To obtain the high confidence deleterious missense nsSNPs, high scores across the tools are considered. SIFT score = 0, PolyPhen-2 score > 0.99, PROVEAN score < -9, and Condel score > 0.7 are used. After integration of the scores, 3 missense nsSNPs in the GABRA6 gene (Table 1) were obtained which are used for further analysis.
Table 1
The most deleterious missense nsSNPs by SIFT, PolyPhen-2, PROVEAN, and CONDEL tools in the GABRA6 gene.
rs ID | Transcript change | Substitution | SIFT4G Score | PolyPhen-2 Score | PROVEAN Score | CONDEL Score |
HumDiv probability | HumVar probability |
rs1428649051 | c.260G > C | W87S | 0 | 1 | 1 | -12.906 | 0.709363974 |
rs1199782347 | c.334T > C | W112R | 0 | 1 | 1 | -12.837 | 0.735214831 |
rs1317373536 | c.928T > C | C310R | 0 | 1 | 1 | -10.964 | 0.764688222 |
3.3. Prediction of disease-related variants
To get more accurate results selected missense nsSNPs were analyzed by SNPs&GO and PMut to predict disease-related nsSNPs. SNPs&GO's result consists of three different algorithms; SNPs&GO, PhD-SNP, and PANTHER. Three selected missense nsSNPs in the human GABRA6 gene were submitted to SNPs&GO and PMut to analyze disease-related nsSNPs. The output of SNPs&GO, PhD-SNP, PANTHER, and PMut predicted that all three selected missense nsSNPs as disease-related (Table 2).
Table 2
List of disease-related missense nsSNPs of GABRA6 by SNPs&GO and PMut tools.
Substitution | SNPs&GO | PMut |
SNPs&GO | PhD-SNP | PANTHER |
Prediction | RI | Probability | Prediction | RI | Probability | Prediction | RI | Probability | Prediction | Score |
W87S | Disease | 8 | 0.919 | Disease | 8 | 0.922 | Disease | 10 | 0.999 | Disease | 0.8214 |
W112R | Disease | 9 | 0.93 | Disease | 9 | 0.937 | Disease | 10 | 0.979 | Disease | 0.8214 |
C310R | Disease | 9 | 0.949 | Disease | 9 | 0.97 | Disease | 9 | 0.974 | Disease | 0.8627 |
RI: Reliability Index |
3.4. Prediction of the molecular and phenotypic consequences of variants
To predict the molecular and phenotypic consequences of selected missense nsSNPs, we further investigated nsSNPs through SNAP2, MutPred2, and SNPeffect4.0 tools. The SNPeffect tool predicts SNPs’ consequence on aggregation-prone regions by TANGO, amyloid-forming regions by WALTZ, and hsp70 chaperone binding sites by LIMBO. The output of SNAP2 of three selected missense nsSNPs in the GABRA6 gene predicted that all three selected missense nsSNPs have damaging effects on protein structure. Since the score > 0.5 suggests pathogenicity in MutPred2, the output of MutPred2 showed that all three selected missense nsSNPs in the GABRA6 gene have damaging functional and structural effects (Detailed information in S3 Table). According to the SNPeffect results, TANGO and WALTZ analysis revealed that the C310R variant (dTANGO equals 302.70 and dWALTZ equals − 553.68) increases the aggregation tendency of the protein and decreases the amyloid propensity of the protein. However, none of the variations does affect the chaperone binding tendency of the protein (Table 3).
Table 3
List of analyzed missense nsSNPs of the GABRA6 gene by SNAP2, MutPred2, and SNPeffect4.0 tools.
Substitution | SNAP2 | MutPred2 score | SNPeffect4.0 |
Prediction | Score | dTANGO | dWALTZ | dLIMBO |
W87S | effect | 87 | 0.953 | 0.00 | -0.07 | 0.00 |
W112R | effect | 87 | 0.948 | 0.00 | -0.12 | 0.00 |
C310R | effect | 87 | 0.964 | 302.70 | -553.68 | 0.00 |
(d in dTANGO, dWALTZ, and dLIMBO is the score change between the mutant protein and wild-type protein)
3.5. Prediction of the effect of variants on protein stability
The effects of selected missense nsSNPs on protein stability were analyzed by I-Mutant3.0 and MUpro tools by calculating Gibbs free energy change of mutant protein and wild-type protein. I-Mutant3.0 analyses showed W87S and W112R variants decreased protein stability, and the C310R variant increased protein stability. While all three variants by Mupro prediction showed decreasing protein stability. W87S and W112R variants have ΔΔG values < -1 kcal/mol in both tools which are expected to alter the function and structure of the protein (Table 4).
Table 4
List of missense nsSNPs of the GABRA6 gene which was analyzed for protein stability by I-Mutant3.0 and MUpro tools.
Substitution | I-Mutant3.0 | Mupro |
Prediction | ΔΔG | RI | Prediction | ΔΔG |
W87S | Decrease | -1.69 | 9 | Decrease | -1.0174971 |
W112R | Decrease | -1.44 | 9 | Decrease | -1.2580621 |
C310R | Increase | -0.32 | 2 | Decrease | -0.85748451 |
RI: Reliability Index |
3.6. Evolutionary conservation analysis
To analyze the evolutionary conservation of selected missense nsSNPs, the ConSurf tool was used, which grouped amino acids based on conservation scores in 9 grades. As shown in Fig. 3, ConSurf results indicated that W87 and W112 residues located in highly conserved regions with conservation scores of 9 and predicted as buried residues and have a structural impact on the protein, and C310 predicted as buried residue with conservation scores of 8.
Figure 3. ConSurf analysis of GABRA6 gene residues. The black boxes indicate the most deleterious missense nsSNPs.
3.7. Prediction of protein secondary structure
Amino acids’ secondary structure corresponding to the proteins was predicted by SOPMA and PSIPRED tools. SOPMA prediction showed the distributions of alpha helix, extended strand, beta-turn, and random coil in proteins. PSIPRED predicted the distributions of strands, helices, and coils in proteins and validated the secondary structure of the proteins. SOPMA results showed 135 alpha helices (29.80%), 104 extended strands (22.96%), 11 beta turns (2.43%), and 203 random coils (44.81%) in the predicted secondary structure. In the 3 amino acid residues that correspond to selected missense nsSNPs, W87 and W112 are located in random coils, and C310 in alpha helices (Fig. 4). PSIPRED results indicated that in the 3 amino acid residues, W87 is located in strands, Y186 in coils, and C310 in helices (S1 Fig).
Figure 4. SOPMA analysis of GABRA6 gene residues. The black boxes indicate the most deleterious missense nsSNPs.
3.8. Prediction of structural effects of variants
HOPE was used to predict structural changes including size, charge, and hydrophobicity value between mutant and wild residues. Out of 3 selected missense nsSNPs in the GABRA6 gene, Ser and Arg residues, respectively in W87S and W112R mutants, are smaller than Trp residue in wild-types forms, while Arg residue in C310R mutant is bigger than the Cys residue in wild-type form. The charge of Trp and Cys residues respectively in W112R and C310R mutants were neutral, then turn to Arg residues with a positive charge. There is not any significant charge change in the W87S mutant. The wild-type residues in all three nsSNPs are more hydrophobic than the mutant residues. (S4 Table). According to the project HOPE results, when the mutant residue is smaller than the wild-type residue, it might lead to loss of interactions in protein structure, and if the mutant residue is bigger than the wild-type residue, it might lead to bumps in protein structure. When the charge of the wild-type residue is lost, it might cause a loss of interactions with other molecules or residues. If the mutation introduces a charge, it might cause the repulsion of ligands or other residues with the same charge. When the hydrophobicity of the wild-type residue is lost or decreased, the hydrophobic interactions will be lost either in the core of the protein or on the surface. If the mutation introduces a more hydrophobic residue, it might cause a loss of hydrogen bonds and/or disturb correct folding.
3.9. 3D structure modeling and visualizing of wild and mutant structures
The Pfam server was used to identify domain regions in the GABRA6 gene and locate selected nsSNPs positions in different domains. In this study, we just need domains that involved deleterious missense nsSNPs (Table 5). Generating 3D structure models was performed by the I-TASSER server. Predicted models’ validation was done by PROCHECK and ERRAT servers. PROCHECK predicted the quality of wild models by Ramachandran plot analysis. The ERRAT indicated the overall quality factor of the predicted models. Visualization of wild and mutant protein structures was performed by PyMol software.
Table 5
List of domains that involved selected deleterious missense nsSNPs in the GABRA6 gene.
Protein name | Domain name | Start residue | Stop residue | Domain length | Involved nsSNPs |
GBRA6_HUMAN | Neurotransmitter-gated ion-channel ligand-binding domain | 32 | 240 | 209 aa | W87S and W112R |
GBRA6_HUMAN | Neurotransmitter-gated ion-channel transmembrane domain | 247 | 399 | 153 aa | C310R |
Pfam reported two domains in the GABRA6 gene, including the neurotransmitter-gated ion-channel ligand-binding domain (32–240), and a neurotransmitter-gated ion-channel transmembrane domain (247–399). Out of three selected missense nsSNPs for further analysis, two (W87S and W112R) are located in the ligand-binding domain, and one (C310R) is located in the transmembrane domain.
3D structure prediction of wild types for the ligand-binding domain, and the transmembrane domain in the GABRA6 gene was modeled by I-TASSER. For the ligand-binding domain, one model predicted a C-score of 1.16, which was a high-quality model. And in the transmembrane domain, out of 5 predicted models, we selected the first model with the C-score of -1.62 (S2-a Fig and S3-a Fig). Validation of predicted models’ quality was done by PROCHECK and ERRAT servers. The result of PROCHECK for the ligand-binding domain model showed, 79.7% of residues in most favored regions, 18.7% in additional allowed regions, 1.6% in generously allowed regions, and 0% in disallowed regions (S2-b Fig). The ERRAT result showed that the overall quality factor for the predicted model was 87.940 (S2-c Fig). PROCHECK’s result for the transmembrane domain model showed, 77.3% of residues in most favored regions, 12.8% in additional allowed regions, 7.1% in generously allowed regions, and 2.8% in disallowed regions (S3-b Fig). The result of ERRAT showed that the overall quality factor for the predicted model was 83.916 (S3-c Fig).
Finally, the predicted models were used to model mutant types by using the mutagenesis feature in PyMol. Structural models for wild-types and deleterious missense nsSNPs in the ligand-binding domain (W87S and W112R) and the transmembrane domain (C310R), are shown in Fig. 5 to Fig. 7.
Figure 5. (W87S)
The amino acid Tryptophan (green) changed to Serine (red) at position 87 in the ligand-binding domain. Visualization was done by PyMol software and HOPE result.
Figure 6. (W112R)
The amino acid Tryptophan (green) changed to Arginine (red) at position 112 in the ligand-binding domain. Visualization was done by PyMol software and HOPE result.
Figure 7. (C310R)
The amino acid Cysteine (green) changed to Arginine (red) at position 310 in the transmembrane domain. Visualization was done by PyMol software and HOPE result.
3.10. Prediction of ligand binding sites
To determine the presence of the selected deleterious missense nsSNPs in protein-binding regions of the GABRA6 gene, we employed the FTSite and COACH algorithms, which are protein-ligand docking tools. FTSite identified three potential binding sites in the ligand-binding domain of the protein.
FTSite predicted three potential binding sites in the ligand-binding domain. Site 1 contains 15 binding residues, site 2 contains 4 binding residues, and site 3 contains 12 binding residues (Fig. 8-a and S5 Table). Additionally, COACH predicted 20 amino acid residues as potential binding sites (S6 Table).
Similarly, in the transmembrane domain, FTSite predicted three potential binding sites. Site 1 contains 9 binding residues, site 2 contains 16 binding residues, and site 3 contains 6 binding residues (Fig. 8-b and S5 Table). COACH predicted 44 amino acid residues as potential binding sites (S6 Table).
Figure 8. Illustration of GABRA6, ligand-binding domain (A), and transmembrane domain (B), with ligand-binding site predictions. FTSite were annotated the ligand-binding pockets. Site 1 illustrated in red, while site 2 and 3 illustrated in green and blue, respectively.
3.11. Molecular dynamics simulation of WT and mutant types
To comparatively study the conformational changes of the WT and mutant types in physiological environments, we performed 100 ns molecular dynamic simulations for each domain. Various parameters, such as root mean square deviations (RMSD), root mean square fluctuations (RMSF), the radius of gyration (Rg), and solvent accessible surface area (SASA), were analyzed using the time-dependent function of molecular dynamic simulation. These parameters were calculated for Cα atoms during the molecular dynamics simulations, with reference to their WT structures.
The RMSD analysis of the WT and mutant types revealed significant deviations in their structural stability. The W87S and W112R mutants in the ligand-binding domain exhibited similar RMSD values. Compared to the WT structure, the mutant types in the ligand-binding domain showed higher fluctuation, as depicted in Fig. 9-a. The C310R mutant in the transmembrane domain also deviated from the WT structure, with the mutant type displaying lower fluctuation than the WT, as shown in Fig. 9-b. The average RMSD values for the WT, W87S, and W112R are 1.88 Å, 2.13 Å, and 2.38 Å, respectively. Furthermore, the WT and C310R mutant have average RMSD values of 6.18 Å and 4.89 Å, respectively.
Figure 9. Analysis of RMSD values of the domain backbone of WT and mutant structures, ligand-binding domain (A), and transmembrane domain (B) at 100 ns simulation. The ordinate is RMSD (Å), and the abscissa is time (ns).
The RMSF analysis of each residue illustrates the effect of mutations on their dynamics. The analysis of RMSF values revealed significant differences in fluctuation between the WT and mutant structures in the N-terminal region of the W87S mutant and the N-terminal region as well as positions 90–99 in the W112R mutant in the ligand-binding domain after 100 ns of molecular dynamic simulation (Fig. 10-a). Similarly, in the transmembrane domain, the analysis of RMSF values indicated significant differences in fluctuation between the WT and mutant structures in positions 38–57, 77–92, and the C-terminal region of the C310R mutant (Fig. 10-b). The RMSF plots show that residues in positions 1–22 and 65–100 of the W112R mutant in the ligand-binding domain, as well as residues in positions 39–53 and 77–99 of the C310R mutant in the transmembrane domain, exhibit a relatively flexible region compared to other residues. Additionally, the highest residual fluctuation is observed at positions 1 (8.01 Å) and 2 (5.96 Å) in the W87S mutant, positions 1 (6.90 Å) and 2 (4.84 Å) in the W112R mutant, and positions 85 (7.95 Å) and 86 (7.71 Å) in the C310R mutant, when compared to their respective WT structures.
Figure 10. Analysis of RMSF values of the domain backbone of WT and mutant structures, ligand-binding domain (A), and transmembrane domain (B) over the entire simulation. The ordinate is RMSF (Å), and the abscissa is residue.
Based on the Rg analysis of the WT and mutant structures, it is evident that the W112R mutant exhibits a higher average Rg value (19.83 Å) in the ligand-binding domain compared to the WT (19.48 Å) and W87S mutant (19.41 Å), as depicted in Fig. 11-a. The WT and W87S mutant structures in the ligand-binding domain display similar average Rg values. Conversely, in the transmembrane domain, the C310R mutant demonstrates a significantly lower average Rg value (16.31 Å) compared to its WT structure (17.03 Å), as illustrated in Fig. 11-b. This suggests a potential decrease in the flexibility of the C310R mutant, and interestingly, the C310R mutant appears to deviate from its Rg value after 30 ns.
Figure 11. Analysis of Rg of the domain backbone of WT and mutant structures, ligand-binding domain (A), and transmembrane domain (B) over the entire simulation. The ordinate is Rg (Å), and the abscissa is time (ns).
Furthermore, the SASA analysis reveals that in the ligand-binding domain, the W112R mutant exhibits a higher average SASA value (11663.50 Ų) than the WT (11468.27 Å), while the W87S mutant displays a lower average SASA value (11370.25 Ų) than the WT, as shown in Fig. 12-a. In the transmembrane domain, the C310R mutant exhibits a lower average SASA value (8796.62 Ų) compared to its WT (8936.70 Ų), as depicted in Fig. 12-b. Since a higher SASA value indicates protein expansion, it can be inferred that in the ligand-binding domain, the WT and W87S mutant are more stable than the W112R mutant, and the W87S mutant is more stable than the WT. In the transmembrane domain, the C310R mutant is more stable than its WT structure.
Figure 12. Analysis of SASA of the domain backbone of WT and mutant structures, ligand-binding domain (A), and transmembrane domain (B) over the entire simulation. The ordinate is SASA (Ų), and the abscissa is time (ns).
To further investigate the structural changes resulting from substitutions in the GABRA6 gene in each mutant type, the number of different secondary structures present in the mutated types was calculated and compared with the wild types, as presented in Supplementary Figs. 4 to 10. Additionally, the contribution of different secondary structures in the protein structure during the simulation is summarized in Table 6. Moreover, to understand the secondary structural changes of the mutant types, the DSSP parameter was calculated during the simulation, as shown in Figs. 13 and 14.
Figure 13. Variation in secondary structure elements for WT (A), W87S (B), and W112R (C) structures with respect to time, in the ligand-binding domain.
Figure 14. Variation in secondary structure elements for WT (A), and C310R (B) structures with respect to time, in the transmembrane domain.
Table 6
Contribution of different secondary structures in protein structure during the simulation.
Protein | Structure | Coil | β-sheet | β-bridge | Bend | Turn | α-helix | 5-helix | 310-helix |
WT ligand-binding domain | 0.63 | 0.22 | 0.48 | 0.00 | 0.13 | 0.12 | 0.02 | - | 0.02 |
W87S mutant | 0.62 | 0.23 | 0.46 | 0.01 | 0.12 | 0.14 | 0.01 | - | 0.03 |
W112R mutant | 0.60 | 0.26 | 0.45 | 0.00 | 0.13 | 0.13 | 0.02 | - | 0.02 |
WT transmembrane domain | 0.63 | 0.15 | 0.00 | 0.00 | 0.13 | 0.21 | 0.42 | 0.00 | 0.08 |
C310R mutant | 0.58 | 0.17 | 0.00 | 0.00 | 0.15 | 0.17 | 0.41 | 0.02 | 0.08 |
(Structure = α-helix + β-sheet + β-bridge + turn) |
The average coil percentage in the ligand-binding domain was determined to be 22.09%, 22.86%, and 25.85% for the WT, W87S mutant, and W112R mutant structures, respectively (S4-a Fig). In the transmembrane domain, the average coil percentage for the WT and C310R mutant structures was found to be 15.49% and 17.28%, respectively (S4-b Fig). Secondary structure analysis in the ligand-binding domain reveals a significant difference in fluctuation between the WT and W112R mutant structures in the coil region residues. Similarly, in the transmembrane domain, a notable difference in fluctuation in the coil region can be observed between the WT and C310R mutant structures after approximately 50 ns of molecular dynamics simulation.
In the ligand-binding domain, the average percentage of β-sheet was found to be 48.32%, 46%, and 44.54% for the WT, W87S mutant, and W112R mutant structures, respectively (S5-a Fig). However, in the transmembrane domain, the average β-sheet percentage was practically 0 (S5-b Fig). The secondary structure analysis in the ligand-binding domain demonstrates a considerable difference in fluctuations between the WT and mutant structures, particularly in the residues of the β-sheet region.
The average β-bridge percentages in the ligand-binding domain were determined to be 0.35% for the WT structure, 0.83% for the W87S mutant structure, and 0.28% for the W112R mutant structure (S6-a Fig). In the transmembrane domain, the β-bridge percentages were found to be 0.03% for the WT structure and 0.04% for the C310R mutant structure (S6-b Fig).
The average percentages of bend were found to be similar among the WT, W87S mutant, and W112R mutant structures in the ligand-binding domain, with values of 12.69%, 12.11%, and 12.65% respectively (S7-a Fig). In the transmembrane domain, the average bend percentages were 12.85% for the WT structure and 14.75% for the C310R mutant structure (S7-b Fig). Notably, the C310R mutant structure exhibited a significant increase in fluctuation in the bend region residues during the first 30 ns compared to the WT structure in the ligand-binding domain.
Regarding the average turn percentages, the ligand-binding domain showed values of 12.10% for the WT structure, 14.06% for the W87S mutant structure, and 13.43% for the W112R mutant structure (S8-a Fig). In the transmembrane domain, the turn percentages were determined to be 20.67% for the WT structure and 16.60% for the C310R mutant structure (S8-b Fig). Notably, there were significant differences in fluctuation between the WT and three mutant structures in both domains within the turn region residues.
In terms of average α-helix percentages, the ligand-binding domain exhibited values of 2% for the WT structure, 1.18% for the W87S mutant structure, and 1.62% for the W112R mutant structure (S9-a Fig). In the transmembrane domain, the average α-helix percentages were 42.47% for the WT structure and 41.37% for the C310R mutant structure (S9-b Fig). Notably, there was a considerable difference in fluctuation in the α-helix region residues between the WT and C310R mutant structures throughout the molecular dynamics simulation times in the transmembrane domain.
No 5-helix secondary structure was observed in the ligand-binding domain. However, in the transmembrane domain, the average percentages of 5-helix were 0.21% for the WT structure and 1.95% for the C310R mutant structure.
Lastly, the average 310-helix percentages in the ligand-binding domain were found to be 2.41% for the WT structure, 2.93% for the W87S mutant structure, and 1.60% for the W112R mutant structure (S10-a Fig). In the transmembrane domain, the average 310-helix percentages were 8.24% for the WT structure and 7.98% for the C310R mutant structure (S10-b Fig). Notably, a decrease in fluctuation was observed in the W112R mutant structure compared to the WT structure in the ligand-binding domain. Conversely, an increase in fluctuation was visible in the transmembrane domain of the C310R mutant structure compared to the WT structure.