In Silico Prediction of Deleterious Non-Synonymous SNPs of Human GABRA2 gene and Altered Protein Structure and Function

Materials And Methods

In this study, an in silico analysis was carried out using datasets acquired from the NCBI database and the Protein Data Bank, which were analyzed with the software described below:

Datasets:

The SNPs (n=7453) of the human GABRA2 gene, which encodes the GABRA2 protein (NCBI Accession: NP_000798), were retrieved from the NCBI dbSNP database. The SNPs are considered to be deleterious when they are linked to disease states. The SNPs belonging to different functional classes were obtained from the database and are shown in Figure 1. Among the SNPs, 93 were missense variants; the other SNPs were intronic (n=6811), 3’ untranslated region (UTR) (n=329), 5’ UTR (n=134), coding synonymous (n=72), nonsense (n=3), stop-gain (n=3) and frameshift variants (n=8). Only the missense variants were selected for further analysis. Validation of the 93 missense SNPs was carried out using the Ensembl and UCSC browsers. A total of 86 SNPs were selected for further analysis, including prediction of their effects on protein structure, stability and function.
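As a quick arithmetic check on the functional-class breakdown above, the reported counts can be tallied in a few lines of Python. This is only an illustrative sketch: the dictionary keys are labels chosen here for readability, while the numbers are those quoted in the text.

```python
# Functional-class counts as quoted in the Datasets paragraph.
# The key names are illustrative labels, not dbSNP field names.
snp_counts = {
    "missense": 93,
    "intronic": 6811,
    "3' UTR": 329,
    "5' UTR": 134,
    "coding synonymous": 72,
    "nonsense": 3,
    "stop-gain": 3,
    "frameshift": 8,
}

# Sum over all classes and compute the share that is missense.
total = sum(snp_counts.values())
missense_fraction = snp_counts["missense"] / total

print(total)                          # 7453
print(f"{missense_fraction:.2%}")     # share of missense variants
```

The classes sum to the stated total of 7453, of which the missense variants make up only a little over one percent.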

In silico methods:

SIFT (Sorting Intolerant From Tolerant) program available at https://sift.bii.a-star.edu.sg/ was used to predict the deleterious or damaging nature of the 86 missense SNPs. This program is based on sequence homology, physical properties of amino acids and the degree of evolutionary conservation of the sequence among various species.

SIFT predictions are given as either "damaging" or "tolerated". The former indicates that the substitution is predicted to affect protein function and the latter indicates that the substitution is predicted to be functionally neutral. A SIFT score of zero indicates an evolutionarily conserved position that is intolerant of substitution, while scores close to one indicate tolerance towards substitution. Scores <0.05 are predicted by the algorithm to be intolerant or highly deleterious, while scores >0.05 are regarded as tolerant of substitutions. Each of the programs described below was also used independently to analyze the 86 missense SNPs in the GABRA2 gene.

The PolyPhen (Polymorphism Phenotyping) server available at http://genetics.bwh.harvard.edu/pph2/ was used to screen for deleterious nsSNPs on the basis of the structural changes they are predicted to induce. The PANTHER (Protein Analysis through Evolutionary Relationships) server available at http://pantherdb.org was used to calculate the length of time a given amino acid has been evolutionarily preserved among various species and to predict the effect of the specific amino acid change on the structural and functional aspects of the protein. The longer an amino acid has been conserved during evolution, the greater the likelihood that it is important for protein structure and function.

The PROVEAN (Protein Variation Effect Analyzer) server available at http://provean.jcvi.org/index.php was used to predict whether single or multiple indels and substitutions in the amino acid sequence affect protein function. The program clusters BLAST hits at 75% global sequence identity. The top 30 clusters of closely related sequences are used to generate the prediction. Each supporting sequence is assigned a delta alignment score, which is then averaged within and across clusters to generate the PROVEAN score. A score ≤ −2.5 indicates that the protein variant is predicted to have a "deleterious" effect.
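The averaging step described above can be illustrated with a minimal Python sketch. It assumes the delta alignment scores have already been computed and grouped by cluster; the function names and the toy numbers are hypothetical and are not part of the PROVEAN software.

```python
from statistics import mean


def provean_like_score(clusters):
    """Average delta alignment scores within each cluster, then across clusters.

    `clusters` is a list of lists; each inner list holds the delta alignment
    scores of the supporting sequences in one cluster (hypothetical input).
    """
    cluster_means = [mean(scores) for scores in clusters if scores]
    if not cluster_means:
        raise ValueError("at least one non-empty cluster is required")
    return mean(cluster_means)


def classify_provean(score, cutoff=-2.5):
    """Apply the cut-off quoted in the text: score <= -2.5 is 'deleterious'."""
    return "deleterious" if score <= cutoff else "neutral"


# Toy example with two clusters (values invented for illustration only).
toy_clusters = [[-4.0, -2.0], [-3.0]]
score = provean_like_score(toy_clusters)
print(score, classify_provean(score))
```

Averaging within clusters first keeps a large cluster of near-identical sequences from dominating the final score.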

The Mutation Assessor program was used to predict the functional impact of amino acid substitutions in proteins. In this program, the functional impact is assessed based on the evolutionary conservation of the affected amino acid in protein homologs. Prediction of pathological (disease-associated) mutations was carried out using PMut (http://mmb.irbbarcelona.org/PMut). The final output is displayed as a pathogenicity index ranging from 0 to 1 (indexes > 0.5 indicate pathological mutations) and a confidence index ranging from 0 (low) to 9 (high).

I-Mutant 3.0 (http://gpcr2.biocomp.unibo.it/cgi/predictors/I-Mutant3.0/I-Mutant3.0.cgi), a support vector machine (SVM)-based tool, was used to assess the 86 SNPs in the GABRA2 gene. The program predicts the protein stability changes caused by single point mutations. It was used both as a classifier for predicting the sign of the protein stability change upon mutation and as a regression estimator for predicting the related change in Gibbs free energy (ΔΔG, reported as DDG). It classifies the prediction as (i) neutral mutation (−0.5 ≤ DDG ≤ 0.5 kcal/mol), (ii) large decrease (< −0.5 kcal/mol) and (iii) large increase (> 0.5 kcal/mol). I-Mutant Disease (Predictor of human Deleterious Single Nucleotide Polymorphisms), built within the I-Mutant suite, was also used.
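The three-class DDG scheme quoted above can be written as a small Python function. This is a sketch of the thresholds only, not of the I-Mutant predictor itself; the function name is hypothetical.

```python
def classify_ddg(ddg_kcal_per_mol):
    """Map a predicted DDG value (kcal/mol) to the three classes in the text.

    neutral:        -0.5 <= DDG <= 0.5
    large decrease:  DDG < -0.5
    large increase:  DDG > 0.5
    """
    if ddg_kcal_per_mol < -0.5:
        return "large decrease"
    if ddg_kcal_per_mol > 0.5:
        return "large increase"
    return "neutral"


# Illustrative values only.
for value in (-1.20, -0.49, 0.23, 0.80):
    print(value, classify_ddg(value))
```

Note that this ternary scheme differs from the sign-based binary classifier, under which any positive DDG counts as increased stability and any negative DDG as decreased stability.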

MutPred (http://mutpred.mutdb.org/), which is based upon the SIFT algorithm and the gain or loss of 14 predicted structural and functional properties, was also used. MutPred is a web application developed to classify an amino acid substitution (AAS) as disease-associated or neutral in humans. In addition, it predicts the molecular cause of a disease-associated or deleterious AAS.

Modeling of the GABRA2 protein structure:

Crystal structures of the proteins encoded by the GABRA genes were not available and, therefore, homology-based computational modeling was used to construct the reference protein structure. The I-TASSER web server (http://zhanglab.ccmb.med.umich.edu/I-TASSER/) was employed for 3D protein modeling. I-TASSER generated 5 models, of which the model with the highest confidence score (C-score), considered together with its RMSD (Root Mean Square Deviation) and TM (Template Modeling) scores, was selected for further analysis.

Superimposition of wild type - mutant proteins and RMSD calculation

Each of the 14 mutant models was generated using the “mutation tool” in Swiss-PDBViewer. The mutation tool was used to replace the native amino acid with the “best” rotamer of the new amino acid. Energy minimization of the predicted models was performed with the GROMOS 43B1 force field as implemented in the DeepView v4.1 tool (https://spdbv.vital-it.ch/energy_tut.html). This force field was built to evaluate the energy of a protein structure as well as to repair distorted geometries through energy minimization.

Energy minimization of both the native and the mutated protein models was carried out using this program. The RMSD values of the atoms upon superimposing the native and the mutant protein structures were calculated in Swiss-PDBViewer with the “Calculate RMS” function. The RMSD obtained by superimposing the native and mutant structures was used to estimate the extent of structural deviation between them and its likely functional effect on the protein. The higher the RMSD value, the more likely the structural deviation is to be associated with altered protein function. The stability of the mutant protein structure was then analyzed with the I-Mutant server (http://folding.biofold.org/i-mutant/i-mutant2.0.html).
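The RMSD referred to above is the root of the mean squared distance between paired atoms. A minimal Python sketch follows, assuming the two structures have already been superimposed and that atoms are listed in the same order in both; it does not reproduce the superposition step or the internals of the Swiss-PDBViewer “Calculate RMS” function, and the coordinates are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z).

    Assumes the structures are already superimposed and atoms are paired
    by position in the lists.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates in angstroms (not taken from the GABRA2 model).
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
mutant = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.3)]
print(round(rmsd(native, mutant), 3))
```

An RMSD of 0.00 Å therefore means that every paired atom coincides after superposition, while larger values reflect larger average displacements.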

Validation of the native and the mutant model using Ramachandran Plot

The Ramachandran plot was used to examine the dihedral angles of the amino acid residues and to identify the energetically allowed residues based upon their phi and psi dihedral angles, thereby assessing the structural quality of the protein models. The energy-minimized native and mutant protein models were validated with the online RAMPAGE program (http://mordred.bioc.cam.ac.uk/~rapper/rampage.php).
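For readers unfamiliar with how phi and psi are obtained, each is a dihedral angle defined by four consecutive backbone atoms: phi by C(i−1), N(i), CA(i), C(i) and psi by N(i), CA(i), C(i), N(i+1). The following Python sketch computes a dihedral angle from four points using the standard vector formulation; it is an illustration under these assumptions and is not the RAMPAGE implementation. The coordinates are arbitrary test points.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points p0-p1-p2-p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Components of b0 and b2 perpendicular to the central bond b1.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Arbitrary points: p0 and p3 are a quarter turn apart around the z axis.
print(dihedral_degrees((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                       (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))
```

Plotting the (phi, psi) pair of every residue gives the Ramachandran plot, in which residues are assigned to favoured, allowed or outlier regions.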

The ConSurf tool (http://consurf.tau.ac.il/2016/) was used to estimate the evolutionary conservation of amino acid positions in the GABRA2 protein sequence. This analysis is based on phylogenetic relations between homologous sequences. The degree of conservation of amino acid residues was estimated using default program settings. The highly conserved residues were identified, and the residues located at the sites of high-risk nsSNPs were classified as exposed or buried in the protein structure. The conserved regions were identified using the colouring scheme and conservation scores (conservation scores: 1–4 variable, 5–6 intermediate, and 7–9 conserved).
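The grouping of conservation scores quoted above can be expressed as a short Python helper. The function name is hypothetical and the sketch covers only the binning of a score that ConSurf has already assigned.

```python
def consurf_category(grade):
    """Bin a ConSurf conservation score (1-9) into the three groups in the text."""
    if not 1 <= grade <= 9:
        raise ValueError("ConSurf conservation scores range from 1 to 9")
    if grade <= 4:
        return "variable"
    if grade <= 6:
        return "intermediate"
    return "conserved"


# Print the category for every possible score.
for g in range(1, 10):
    print(g, consurf_category(g))
```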

GeneMANIA (http://www.genemania.org/), an online tool that predicts the function of genes and gene sets using a very large set of functional association data, was used to analyze gene-gene interactions of GABRA2. The web interface generates hypotheses about gene function, analyzes gene lists and prioritizes genes for functional assays. A list of functionally similar genes, identified from available genomics and proteomics data, was generated.

Results

Fourteen nsSNPs were predicted to be deleterious by all six programs (SIFT, PolyPhen, PANTHER, PROVEAN, Mutation Assessor and PMut); these 14 SNPs are listed in Table 1. The details of the SNPs as identified by the individual programs are shown in supplementary tables 1-6. Among the 14 nsSNPs, I-Mutant analysis predicted 13 nsSNPs to be associated with decreased stability; the exception, rs765251624 (R58T), showed an increase in stability (Table 1). The variants that were predicted to have decreased stability were also found to have increased RMSD values. Cross-validation of the results with I-Mutant Disease in the I-Mutant suite 3.0 showed that three variants (V208F, P426S, L170S) were neutral and the remaining variants were predicted to be associated with disease. The RMSD values were in keeping with the I-Mutant results. In the analysis, one variant (I255T) was marked as unknown, and RI and DDG values were not obtained from the program.

A total of 86 nsSNPs were identified from the database and analyzed using different in silico programs. Of these, 29 nsSNPs were predicted to be functionally deleterious (affecting protein structure) by the SIFT server, showing a highly deleterious tolerance index score of 0.00. The remaining variants were predicted as “tolerated”. Functionally deleterious nsSNPs as predicted by the SIFT program are shown in supplementary table 1.
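The SIFT cut-off described in the Methods can be sketched in Python as follows. The handling of a score of exactly 0.05 is an assumption made here, following the "≤ 0.05" wording used in the Discussion; the function name is hypothetical and the example scores are invented.

```python
def classify_sift(score, cutoff=0.05):
    """Label a SIFT tolerance index (0 to 1) as 'damaging' or 'tolerated'.

    Assumption: a score equal to the cutoff is treated as damaging.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT scores lie between 0 and 1")
    return "damaging" if score <= cutoff else "tolerated"


# Invented scores for illustration.
for s in (0.00, 0.03, 0.20, 1.00):
    print(s, classify_sift(s))
```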

The PolyPhen server predicted 42 of the 86 nsSNPs to be functionally deleterious to the protein structure. Of these, 29 nsSNPs were predicted to be “probably damaging” with scores ranging from 0.96 to 1.00 and 13 nsSNPs were predicted to be “possibly damaging” with scores ranging from 0.454 to 0.933 (supplementary table 2). The PANTHER server predicted 76 nsSNPs to be damaging and the remaining nsSNPs to be benign (supplementary table 3). Among the 86 nsSNPs, 22 were predicted as deleterious by all three programs, viz. the SIFT, PolyPhen and PANTHER servers.

The PROVEAN server predicted 25 nsSNPs to be functionally damaging out of the 86 nsSNPs submitted for analysis (supplementary table 4). Of these, 18 nsSNPs were also predicted by the SIFT, PolyPhen, and PANTHER servers. Mutation Assessor predicted 25 nsSNPs to be associated with a diseased phenotype. Of those, 15 nsSNPs were also predicted by the SIFT, PolyPhen, PANTHER, and PROVEAN servers (supplementary table 5).

The functional impact of the 86 nsSNPs in the GABRA2 protein was analyzed using the PMut server. Of the 86 nsSNPs, 37 were classified as pathological and the remainder as neutral. Among those, 14 nsSNPs were common to those predicted by the above five servers (SIFT, PolyPhen, PANTHER, PROVEAN and Mutation Assessor) (supplementary table 6). MutPred was used to determine the degree of tolerance for each amino acid substitution based on physicochemical properties. The variants (P445H, P280S, I255T, V208F, S186C, L170S, W122L, I121N, F93C, Y73C, R58T and T43A) were predicted to cause potential structural and functional changes in the protein. A188T was predicted to cause a less significant functional change. The results of the MutPred prediction server are shown in Table 2.
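The step-wise narrowing reported above (variants called deleterious by every program) amounts to a set intersection. The sketch below illustrates this in Python with toy identifiers; the tool labels and variant names in the example are placeholders and do not reproduce the study's actual calls.

```python
def consensus(calls_by_tool):
    """Return the variants flagged as deleterious by every tool.

    `calls_by_tool` maps a tool name to the collection of variants that
    tool predicted to be deleterious.
    """
    sets = [set(variants) for variants in calls_by_tool.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


# Placeholder calls (v1..v5 are toy identifiers, not GABRA2 variants).
toy_calls = {
    "tool_A": {"v1", "v2", "v3", "v4"},
    "tool_B": {"v2", "v3", "v4", "v5"},
    "tool_C": {"v1", "v3", "v4"},
}
print(sorted(consensus(toy_calls)))
```

Requiring agreement across all tools is a conservative choice: it lowers the number of candidates but favours variants supported by methods built on different principles.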

Studies on protein mutation and energy minimization of the native and mutated protein

The 14 nsSNPs predicted to be potentially deleterious by all six programs were introduced into the GABRA2 protein model using the “mutation tool” in Swiss-PDBViewer, which replaces the native amino acid with the new one. Energy minimization of the native and each of the mutant proteins was performed with Swiss-PDBViewer. The resulting energy values of the native and the mutant structures are given in Table 1. The total energy of the native protein structure was determined to be −19946.123 kJ/mol. Among the 14 mutants, R58T had, even after energy minimization, the total energy closest to that of the native structure (−19646 kJ/mol). The remaining 13 mutants had markedly less negative total energies, ranging from −13567 to −13857 kJ/mol.

The total energy comparison showed that all mutants had less negative total energies than the native GABRA2 protein. For three of the mutant proteins, the RMSD relative to the native protein was 0.00 Å. The RMSD values of the other mutants relative to the native protein ranged from 0.23 Å to 0.15 Å, indicating structural changes. The higher the RMSD value, the greater the deviation between the native and the mutant protein structures; this, in turn, alters the protein's stability and functional activity. The native GABRA2 protein as predicted by the I-TASSER program is shown in Figure 2, highlighting the identified deleterious SNPs and their positions in the neurotransmitter-gated ion-channel ligand-binding domain. The superimposed mutant protein structures are shown in Figure 3.

Validation of the native and the mutant model using Ramachandran Plot

The energy minimized native and mutant protein structures in .pdb formats were submitted to RAMPAGE for validating the protein structure using the Ramachandran plot. The results are shown in Figures 4 and 5. The native protein model contains 339 residues (75.5%) in the favoured region, 67 residues (14.9%) in the allowed region and 15 residues (5.3%) in the outlier region. The mutant protein models also showed similar results (Table 1) indicating that there are no major structural changes in mutant protein models compared to the native protein model.

Conservation analysis of GABRA2 protein

ConSurf analysis identified conserved residues in the GABRA2 protein and predicted whether residues are exposed or buried in the GABRA2 protein structure (Figure 6). ConSurf exploits evolutionary variation in multiple sequence alignments to determine the degree of conservation. The results show that, among the 14 predicted deleterious nsSNPs, ten (T43A, R58T, F93C, I121N, W122L, S186C, A188T, I255T, P280S, P426S) occurred at conserved sites.

The results of the biological interaction network analysis by GeneMANIA, including the gene-gene interactions of GABRA2, are shown in Figure 7. The GABRA2 gene is predicted to interact with other genes, and its major functions were shown to be neurotransmitter receptor activity and neuron-neuron synaptic transmission. The most important interactions are with GABRA1, GABRA3, GABRA6, GABRA4, GABRG2 and GABRB2.

Discussion

The development of alcohol dependence is a complex and dynamic process that needs to be investigated in terms of comorbidities with psychiatric disorders especially to develop new psychotherapeutic and pharmacotherapeutic options (Farren et al. 2012). Studies support the importance of genetic influences in substance abuse and dependence. Some specific genes of interest are associated with alcohol use disorders (Mayfield et al. 2008). A study reported from Germany has documented GABRA2 gene sequence variation concerning alcohol abuse. Of the four haplotypes investigated, T-CA-C-A-T-T-C haplotype was significantly more often present in alcohol-dependent subjects compared to controls (Soyka et al. 2008).

Strong associations were found between SNPs in GABRA2, the gene encoding the α2 subunit of the GABAA receptor, and both alcohol dependence and brain oscillations, seen as distinct electroencephalography patterns (Edenberg et al. 2004). The link between alcohol abuse and SNPs in GABRA2 was found in subjects with illicit drug dependence (Agrawal et al. 2006). Genetic variations, particularly nsSNPs resulting in amino acid changes, can disrupt functional sites responsible for protein activity, structure, or stability (Schaefer et al. 2012). Investigating the role of nsSNPs in the structural and functional changes in GABRA2 will help in understanding the genetic mechanism of alcohol dependence associated with impulsive disorders. In a study of 295 Americans, of whom 97% were of Caucasian origin, primarily the intronic regions of the GABRA2 gene were analyzed to detect association with impulsiveness and lifetime alcohol problems. Our study looked at missense variants in exons, collected from the NCBI database and representing diverse population data, as these would directly affect GABRA2 protein stability and function.

The enormous amount of human genomic sequence information obtained from large-scale projects supports several computational approaches for identifying protein variants in which single amino acid changes disrupt gene function. Several prediction tools have been developed to assess amino acid variants. Programs such as SIFT, PolyPhen-2, Mutation Assessor, MAPP, PANTHER, LogR.E-value, Condel and several others predict the effect of missense variants on protein function. The SIFT program prediction through PSI-BLAST indicated 29 nsSNPs as damaging with scores ≤ 0.05. The 14 commonly predicted nsSNPs all had the lowest possible score of zero, indicating the high predictive ability of the program.

PolyPhen-2 uses eight sequence-based and three structure-based predictive features, comparing the properties of the wild-type allele and the corresponding mutant allele that together define an amino acid replacement (Adzhubei et al. 2010). Our SNP data were analyzed in terms of pph2_prob (classifier probability of the variation being damaging), pph2_FPR [classifier model False Positive Rate (1 - specificity) at the above probability] and pph2_TPR [classifier model True Positive Rate (sensitivity) at the above probability]. Among the 14 nsSNPs identified by all six programs, 13 had a high pph2_prob compared to other predicted mutations, ranging from 0.99 to 1, and were predicted as probably damaging.

The PANTHER program estimates the likelihood of a particular nonsynonymous coding SNP to cause a functional impact on the protein (Tang et al. 2016). The position-specific evolutionary preservation (PSEP) tool employed in the PANTHER program uses a distinct metric based on evolutionary preservation wherein it calculates the length of time (in millions of years) a given amino acid has been preserved in the lineage leading to the protein of interest. The longer the preservation time, the greater the likelihood of functional impact. The 76 SNPs that included the 14 nsSNPs were identified as probably damaging.

Out of the 86 nsSNPs, the PROVEAN server predicted 25 nsSNPs to be functionally damaging, of which 18 nsSNPs were also predicted by other programs, namely the SIFT, PolyPhen, and PANTHER servers. The 14 nsSNPs predicted by all six programs had scores ranging from −3.13 to −11.96. The prediction accuracy of this program for human protein variations was reported to be 79.5%. Choi et al. (2012) compared the performance of PROVEAN using two different protein databases, the NCBI NR (non-redundant) protein database and the UniProtKB/Swiss-Prot protein database. Their results indicated a reduced accuracy of 7% when using the UniProtKB/Swiss-Prot database instead of the NCBI NR protein database. The authors highlight the usefulness of the program to identify deleterious single nucleotide variants and variants that cause protein sequence indels. We found this program useful for predicting deleterious SNPs consistent with other prediction servers.

To predict the pathology of the identified 14 nsSNPs, the PMut program was utilized. Of 86 nsSNPs tested, 37 including the 14 nsSNPs were predicted to be associated with a pathological disease. PMut is reported to be a powerful tool to predict the functional consequences of protein sequence variants (Lopez-Ferrando et al. 2017).

We utilized six different bioinformatics programs (SIFT, PolyPhen, PANTHER, PROVEAN, Mutation Assessor and PMut) that use different methods to predict nsSNPs with a deleterious effect on GABRA2 protein function. The 14 of 86 nsSNPs that were predicted as damaging by all six programs were further analyzed for structural stability changes.

Among the 14 nsSNPs, 10 were found in the conserved regions of the protein sequence as identified by ConSurf analysis. The nsSNPs that are located at highly conserved amino acid positions tend to be more deleterious than nsSNPs that are located at non-conserved sites. In general, highly conserved amino acids either buried (structural) or exposed (functional) act as biologically active sites compared to other residues. Any substitutions in these functional residues may either lead to complete loss of biological functions or cause severe deleterious effects compared to other polymorphisms of the non-conserved site (Dakal et al. 2017).

The functional and structural sites were identified with the ConSurf program, which combines evolutionary data and solvent accessibility predictions. In our study, among the 14 nsSNPs, 6 nsSNPs (P445H, P280S, V208F, S186C, T43A, R58T) were found on the exposed surface and 8 (I255T, L170S, W122L, I121N, F93C, Y73C, P426S, A188T) were buried.

We, therefore, analyzed the predicted structural consequences using tools available in Swiss-PDB viewer. The 3D structures of variants and the wild type were generated in I-TASSER program. The model that had a high C-score for each of the variants and the wild type was selected for analysis in the DeepView Swiss-PDBViewer.

The total energy of the native protein structure was determined to be −19946 kJ/mol. Among the 14 mutants, R58T had the total energy closest to that of the native structure (−19646 kJ/mol) after energy minimization. The remaining 13 mutants had much less negative energy values, ranging from −13567.015 to −13857.964 kJ/mol. The variant S186C had a high RMSD difference of 0.150 Å. The Ramachandran plot as analyzed in the RAMPAGE program indicated no major changes in terms of shifts to the favoured or allowed regions. The number of residues in both regions remained the same. However, the I-Mutant suite, which was used to analyze the changes in protein structural stability, predicted 10 of 14 nsSNPs as disease-related and 3 nsSNPs as neutral polymorphisms. The reliability index ranged from 2 to 10 for the 13 variants.

In our study, of the 14 variants analyzed, one variant, R58T, had a DDG value of 0.23, indicating a weak effect. Of the 12 other variants with a predicted value (excluding I255T), ten had scores less than −0.5, indicating a largely destabilizing effect, and two, W122L and S186C, had borderline scores of −0.49 and −0.47 respectively, indicating a weak effect. Interestingly, among the 14 variants analyzed, all except P445H were present in the neurotransmitter-gated ion-channel ligand-binding domain of the protein. This indicates the potential role of the SNPs in the functionality of the protein.

While studying alcohol abuse, we have to be conscious of "Emergent Complexity" i.e. alcoholism could be a product of multi-factorial elements contributing to this psychiatric condition. One of several such elements would be the polymorphism of important proteins which contribute to change in function of pathways in the central nervous system. One such observation has been the SNPs in GABRA2 gene. Our study showed the effect of the SNPs in the structure and function of the protein (Agrawal et al. 2012).

The MutPred program predicts the impact of single amino acid substitutions on more than 50 different protein properties to infer the molecular mechanisms of pathogenicity. The software package includes genetic and molecular data of amino acid substitutions leading to varied pathology. It includes a general pathological prediction and a ranked list of specific molecular alterations potentially affecting the phenotype (Pejaver et al. 2017).

In our study, two variants (W122L, I121N) were available in the dbSNP database with a frequency of < 0.01. Nine variants (R58T, Y73C, W122L, L170S, I255T, A188T, P280S, P426S, P445H) were available and were identified as "rare" (Minor Allele Frequency < 0.01) variations by the Exome Aggregation Consortium (ExAC) database (http://exac.broadinstitute.org/). This database lists a total of 53 missense variants with a constraint metric (z value) of 3.34. Positive Z scores indicate increased constraint (intolerance to variation) and therefore that the gene had fewer variants than expected.

A rare functional variant could alter gene function significantly even though it occurs at low frequency in a population. The “common-disease rare-variant” hypothesis indicates that variants affecting health are under purifying selection and thus should be found only at low frequencies in human populations. Rare variants are increasingly being studied, as a consequence of exome and whole-genome sequencing efforts. While these variants are individually infrequent in populations, there are many such variants in human populations, and they can be unique to specific populations. They are more likely to be deleterious than common variants, as a result of rapid population growth and weak purifying selection (Nelson et al. 2012).

Our overall results indicate that a considerable proportion of the nsSNPs (14/86; 16%) are predicted to be associated with GABRA2 protein dysfunction. Gene-gene interactions were studied to highlight candidate genes that could be associated with alcohol dependence, especially if haplotypes are to be studied in the future. Among the 14 nsSNPs, 10 were within conserved regions. Of the functionally deleterious nsSNPs, 10 were predicted to be associated with disease, though none of them showed structural variation in the Ramachandran plot.

GABRA2 gene variants are associated with alcohol dependence and other mental disorders, but nsSNPs of the GABRA2 gene have not been studied earlier. To our knowledge, this is the first report on the SNPs focusing primarily on exons associated with functional changes directed by missense variants in the gene. It could be speculated that these nsSNPs play a vital role in an individual's alcohol dependence. However, this should be further evaluated through studies on individuals with varying degrees of alcohol dependence in addition to Genome-Wide Association Studies (GWAS). A case-control study comparing the SNPs of alcoholics with those of non-alcoholics will help rule out type I error in observations (Ray et al. 2009). These data could help institute social intervention programs. With a better understanding of the genetic basis of alcoholism, it may be possible to pre-screen 'at-risk' individuals and design personalized early intervention, especially among the youth population (counselling, change of peer groups, improved family support and, if necessary, ensured employment and domiciliary status).

References

Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7:248–9.
Agrawal A, Edenberg HJ, Foroud T, Bierut LJ, Dunne G, Hinrichs AL, Nurnberger JI, Crowe R, Kuperman S, Schuckit MA, Begleiter H, Porjesz B, Dick DM. Association of GABRA2 with drug dependence in the collaborative study of the genetics of alcoholism sample. Behav Genet. 2006;36:640–50.
Agrawal A, Verweij KJ, Gillespie NA, Heath AC, Lessov-Schlaggar CN, Martin NG, Nelson EC, Slutske WS, Whitfield JB, Lynskey MT. The genetics of addiction-a translational perspective. Transl Psychiatry. 2012 Jul 17;2:e140.
Begleiter H, Porjesz B. Genetics of human brain oscillations. Int J Psychophysiol Off J Int Organ Psychophysiol. 2006;60:162–71.
Bierut LJ, Agrawal A, Bucholz KK, Doheny KF, Laurie C, Pugh E, Fisher S, Fox L, Howells W, Bertelsen S, Hinrichs AL, Almasy L, Breslau N, Culverhouse RC, Dick DM, Edenberg HJ, Foroud T, Grucza RA, Hatsukami D, Hesselbrock V, Johnson EO, Kramer J, Krueger RF, Kuperman S, Lynskey M, Mann K, Neuman RJ, Nöthen MM, Nurnberger JI Jr, Porjesz B, Ridinger M, Saccone NL, Saccone SF, Schuckit MA, Tischfield JA, Wang JC, Rietschel M, Goate AM, Rice JP; Gene, Environment Association Studies Consortium. A genome-wide association study of alcohol dependence. Proc Natl Acad Sci U S A. 2010;107:5082–7.
Birkley EL, Smith GT. Recent advances in understanding the personality underpinnings of impulsive behavior and their role in risk for addictive behaviors. Curr Drug Abuse Rev. 2011;4:215–27.
Choi Y, Sims GE, Murphy S, Miller JR, Chan AP. Predicting the Functional Effect of Amino Acid Substitutions and Indels. PLOS ONE. 2012;7(10):e46688.
Clarke TK, Adams MJ, Davies G, Howard DM, Hall LS, Padmanabhan S, Murray AD, Smith BH, Campbell A, Hayward C, Porteous DJ, Deary IJ, McIntosh AM. Genome-wide association study of alcohol consumption and genetic overlap with other health-related traits in UK Biobank (N=112 117). Mol Psychiatry. 2017;22:1376–84.
Covault J, Gelernter J, Hesselbrock V, Nellissery M, Kranzler HR. Allelic and haplotypic association of GABRA2 with alcohol dependence. Am J Med Genet Part B Neuropsychiatr Genet Off Publ Int Soc Psychiatr Genet. 2004;129B:104–9.
Dakal TC, Kala D, Dhiman G, Yadav V, Krokhotin A, Dokholyan NV. Predicting the functional consequences of non-synonymous single nucleotide polymorphisms in IL8 gene. Sci Rep. 2017;7:6525.
Edenberg HJ, Dick DM, Xuei X, Tian H, Almasy L, Bauer LO, Crowe RR, Goate A, Hesselbrock V, Jones K, Kwon J, Li TK, Nurnberger JI Jr, O'Connor SJ, Reich T, Rice J, Schuckit MA, Porjesz B, Foroud T, Begleiter H. Variations in GABRA2, Encoding the α2 Subunit of the GABAA Receptor, Are Associated with Alcohol Dependence and with Brain Oscillations. Am J Hum Genet. 2004;74:705–14.
Farren CK, Hill KP, Weiss RD. Bipolar Disorder and Alcohol Use Disorder: A review. Curr Psychiatry Rep. 2012;14:659–66.
Feldstein Ewing SW, LaChance HA, Bryan A, Hutchison KE. Do genetic and individual risk factors moderate the efficacy of motivational enhancement therapy? Drinking outcomes with an emerging adult sample. Addict Biol. 2009;14:356–65.
Gonzalez-Nunez V. Role of gabra2, GABAA receptor alpha-2 subunit, in CNS development. Biochem Biophys Rep. 2015;3:190–201.
Hassan MM, Omer SE, Khalf-allah RM, Mustafa RY, Ali IS, Mohamed SB. Bioinformatics Approach for Prediction of Functional Coding/Noncoding Simple Polymorphisms (SNPs/Indels) in Human BRAF Gene. Adv Bioinforma. 2016;2016:2632917.
Huang YH, Liu HC, Tsai FJ, Sun FJ, Huang KY, Chiu YC, Huang YH, Huang YP, Liu SI. Correlation of impulsivity with self-harm and suicidal attempt: a community study of adolescents in Taiwan. BMJ Open. 2017;7:e017949.
Littlefield AK, Sher KJ, Wood PK. A personality-based description of maturing out of alcohol problems: extension with a five-factor model and robustness to modeling challenges. Addict Behav. 2010;35:948–54.
López-Ferrando V, Gazzo A, de la Cruz X, Orozco M, Gelpí JL. PMut: a web-based tool for the annotation of pathological variants on proteins, 2017 update. Nucleic Acids Res. 2017;45(W1):W222–8.
Mayfield RD, Harris RA, Schuckit MA. Genetic factors influencing alcohol dependence. Br J Pharmacol. 2008;154:275–87.
Nelson MR, Wegmann D, Ehm MG, Kessner D, St Jean P, Verzilli C, Shen J, Tang Z, Bacanu SA, Fraser D, Warren L, Aponte J, Zawistowski M, Liu X, Zhang H, Zhang Y, Li J, Li Y, Li L, Woollard P, Topp S, Hall MD, Nangle K, Wang J, Abecasis G, Cardon LR, Zöllner S, Whittaker JC, Chissoe SL, Novembre J, Mooser V. An abundance of rare functional variants in 202 drug target genes sequenced in 14,002 people. Science. 2012;337(6090):100–4.
Nusbaumer MR, Reiling DM. Environmental influences on alcohol consumption practices of alcoholic beverage servers. Am J Drug Alcohol Abuse. 2002;28:733–42.
Pejaver V, Mooney SD, Radivojac P. Missense variant pathogenicity predictors generalize well across a range of function-specific prediction challenges. Hum Mutat. 2017;38:1092–108.
Ray LA, Hutchison KE. Associations among GABRG1, level of response to alcohol, and drinking behaviors. Alcohol Clin Exp Res. 2009;33:1382–90.
Schaefer C, Rost B. Predict impact of single amino acid change upon protein structure. BMC Genomics. 2012;13(Suppl 4):S4.
Soyka M, Preuss UW, Hesselbrock V, Zill P, Koller G, Bondy B. GABA-A2 receptor subunit gene (GABRA2) polymorphisms and risk for alcohol dependence. J Psychiatr Res. 2008;42:184–91.
Tang H, Thomas PD. PANTHER-PSEP: predicting disease-causing genetic variants using position-specific evolutionary preservation. Bioinformatics. 2016;32:2230–2.
Taylor EM, Murphy A, Boyapati V, Ersche KD, Flechais R, Kuchibatla S, McGonigle J, Metastasio A, Nestor L, Orban C, Passetti F, Paterson L, Smith D, Suckling J, Tait R, Lingford-Hughes AR, Robbins TW, Nutt DJ, Deakin JF, Elliott R; ICCAM Platform. Impulsivity in abstinent alcohol and polydrug dependence: a multidimensional approach. Psychopharmacology (Berl). 2016;233:1487–99.
Treutlein J, Cichon S, Ridinger M, Wodarz N, Soyka M, Zill P, Maier W, Moessner R, Gaebel W, Dahmen N, Fehr C, Scherbaum N, Steffens M, Ludwig KU, Frank J, Wichmann HE, Schreiber S, Dragano N, Sommer WH, Leonardi-Essmann F, Lourdusamy A, Gebicke-Haerter P, Wienker TF, Sullivan PF, Nöthen MM, Kiefer F, Spanagel R, Mann K, Rietschel M. Genome-wide association study of alcohol dependence. Arch Gen Psychiatry. 2009;66:773–84.
Villafuerte S, Strumba V, Stoltenberg SF, Zucker RA, Burmeister M. Impulsiveness mediates the association between GABRA2 SNPs and lifetime alcohol problems. Genes Brain Behav. 2013;12:525–31.
Way MJ, Ali MA, McQuillin A, Morgan MY. Genetic variants in ALDH1B1 and alcohol dependence risk in a British and Irish population: A bioinformatic and genetic study. PLoS One. 2017;12:e0177009.

Tables

Table 1

Analysis of the 14 identified nsSNPs in terms of the nature of the mutation, energy minimization, and structural integrity of the wild-type and mutant models.

AA variant	Variant ID	I-Mutant			Swiss-PDBViewer		RAMPAGE (No. of residues)
AA variant	Variant ID	I-Mutant Disease	RI	DDG Value (kcal/mol)	Total energy after energy minimization (kJ/mol)	RMSD (Å)	Favored	Allowed	Outlier
Wild	-	-	-	-	-19946.123	0.000	339 ( 75.5%)	67 ( 14.9%)	43 ( 9.6%)
T43A	rs41305781	Disease	8	-1.33	-13702.660	0.000	339 ( 75.5%)	67 ( 14.9%)	43 ( 9.6%)
R58T	rs765251624	Disease	3	0.23	-19646.230	0.000	339 ( 75.5%)	67 ( 14.9%)	43 ( 9.6%)
Y73C	rs753040126	Disease	5	-0.98	-13652.485	0.000	339 ( 75.5%)	67 ( 14.9%)	43 ( 9.6%)
F93C	rs199725032	Disease	7	-2.34	-13680.411	0.029	339 ( 75.5%)	67 ( 14.9%)	43 ( 9.6%)
I121N	rs749035438	Disease	6	-0.57	-13857.964	0.059	339 ( 75.5%)	67 ( 14.9%)	43 ( 9.6%)
W122L	rs775541780	Disease	8	-0.49	-13567.015	0.081	339 ( 75.5%)	67 ( 14.9%)	43 ( 9.6%)
S186C	rs373038663	Disease	4	-0.47	-13630.729	0.150	339 ( 75.5%)	67 ( 14.9%)	43 ( 9.6%)
A188T	rs768736908	Disease	8	-1.05	-13724.141	0.023	339 ( 75.5%)	67 ( 14.9%)	43 ( 9.6%)
V208F	rs752066816	Neutral	2	-0.46	-13739.968	0.033	339 ( 75.5%)	67 ( 14.9%)	43 ( 9.6%)
I255T	rs767460850	Unknown	-	-	-13738.369	0.055	339 ( 75.5%)	67 ( 14.9%)	43 ( 9.6%)
P280S	rs771481457	Disease	8	-2.01	-13763.594	0.057	340 ( 75.7%)	66 ( 14.7%)	43 ( 9.6%)
P426S	rs754301188	Neutral	9	-0.86	-13749.602	0.067	339 ( 75.5%)	67 ( 14.9%)	43 ( 9.6%)
P445H	rs761715134	Disease	10	-2.48	-13763.185	0.065	339 ( 75.5%)	67 ( 14.9%)	43 ( 9.6%)
L170S	rs749473765	Neutral	7	-0.74	-13684.243	0.047	339 ( 75.5%)	67 ( 14.9%)	43 ( 9.6%)
RI: Reliability Index, computed only when the sign of the stability change is predicted.
DDG Value: free energy change, calculated as the unfolding Gibbs free energy of the mutated protein minus the unfolding Gibbs free energy of the wild type (kcal/mol).
RMSD: Root Mean Square Deviation.
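The footnote definitions can be illustrated with a short Python sketch. It assumes the sign convention stated above (DDG is the mutant value minus the wild-type value, so a negative DDG points to reduced stability) and that the RAMPAGE percentages are each residue count divided by the sum of the three counts. The function names are hypothetical and are not part of I-Mutant or RAMPAGE.

```python
# Illustrative helpers only; the names are hypothetical, not an I-Mutant or RAMPAGE API.

def stability_change(ddg_kcal_mol):
    """Interpret the sign of DDG = dG_unfold(mutant) - dG_unfold(wild type)."""
    if ddg_kcal_mol < 0:
        return "decreased stability"
    if ddg_kcal_mol > 0:
        return "increased stability"
    return "no predicted change"


def residue_percentages(favored, allowed, outlier):
    """Convert Ramachandran residue counts into percentages of all scored residues."""
    total = favored + allowed + outlier
    return tuple(round(100 * n / total, 1) for n in (favored, allowed, outlier))


# Values transcribed from Table 1.
print(stability_change(-2.48))           # P445H
print(stability_change(0.23))            # R58T
print(residue_percentages(339, 67, 43))  # wild-type model
```

Under these assumptions, the wild-type counts reproduce the percentages listed in the table.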

Table 2

Molecular changes predicted by the MutPred server

AA variation	Molecular mechanisms	g-score	P-score
T43A	Altered Transmembrane protein***	0.874	2.2e-03
	Gain of N-linked glycosylation at N38***		6.7e-03

R58T	Loss of Allosteric site at R58***	0.945	1.8e-04
	Altered Metal binding***		3.8e-03
	Loss of Relative solvent accessibility**		0.01
	Altered Disordered interface**		0.04
	Altered Ordered interface**		0.01
	Altered Transmembrane protein***		2.0e-03
	Altered DNA binding**		0.01
	Loss of Catalytic site at R58**		0.04
	Gain of Sulfation at Y53**		0.04

Y73C	Altered Disordered interface***	0.895	2.0e-03
	Altered Ordered interface***		1.6e-03
	Altered Transmembrane protein***		2.7e-03

F93C	Altered Ordered interface***	0.936	9.5e-03
	Altered Metal binding**		0.02
	Altered DNA binding**		0.01
	Gain of Allosteric site at D90*		0.05
	Altered Transmembrane protein**		0.01
	Gain of Pyrrolidone carboxylic acid at Q95**		0.03

I121N	Loss of Allosteric site at W122***	0.954	7.5e-03
	Altered Ordered interface**		0.04
	Altered Transmembrane protein***		2.7e-03
	Altered Metal binding**		0.01
	Gain of Ubiquitylation at K120**		0.04
	Altered Stability**		0.03

W122L	Loss of Allosteric site at W122***	0.937	1.5e-03
	Altered Ordered interface***		5.3e-03
	Loss of Strand**		0.02
	Altered Metal binding**		0.02
	Altered Transmembrane protein***		8.6e-03
	Gain of Ubiquitylation at K120**		0.04

L170S	Gain of Intrinsic disorder**	0.935	0.01
	Altered Ordered interface**		0.01
	Altered Metal binding***		5.2e-03
	Altered Transmembrane protein***		9.6e-04

S186C	Altered Transmembrane protein***	0.855	5.4e-05
	Altered Ordered interface***		8.2e-03

A188T	Altered Transmembrane protein**	0.745	2.9e-05
	Loss of Relative solvent accessibility*		0.03
	Altered Stability*		0.02

V208F	Altered Transmembrane protein***	0.896	2.9e-05
	Loss of Relative solvent accessibility**		0.03
	Altered Stability**		0.02

I255T	Altered Transmembrane protein***	0.880	9.7e-06
	Altered Ordered interface**		0.02
	Altered Stability**		0.04

P280S	Altered Transmembrane protein***	0.899	4.2e-04
	Gain of Relative solvent accessibility**		0.03
	Altered Metal binding**		0.04

P426S	Altered Ordered interface**	0.797	0.05
	Loss of Allosteric site at R422**		0.02
	Altered Transmembrane protein***		3.9e-03
	Altered DNA binding**		0.02

P445H	Altered Metal binding***	0.825	3.3e-03
	Altered Ordered interface***		4.8e-03
	Loss of Loop**		0.04
	Altered Transmembrane protein***		1.2e-03
	Loss of Allosteric site at Y440**		0.03
The output of MutPred contains a general score (g), i.e., the probability that the amino acid substitution is deleterious/disease-associated, and the top 5 property scores (p), where p is the P-value that certain structural and functional properties are impacted. Certain combinations of high general scores and low property scores are referred to as hypotheses. Scores with g > 0.5 and p < 0.05 are referred to as actionable hypotheses (*). Scores with g > 0.75 and p < 0.05 are referred to as confident hypotheses (**). Scores with g > 0.75 and p < 0.01 are referred to as very confident hypotheses (***).
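The thresholds in this footnote amount to a simple decision rule, sketched below in Python. The function name and the returned labels are hypothetical and chosen only for illustration; the cut-offs are those stated in the footnote, applied with strict inequalities, and the sketch is not MutPred's own code.

```python
def mutpred_hypothesis(g, p):
    """Classify one (general score, property P-value) pair using the footnote thresholds.

    The strictest category is checked first because the categories are nested.
    """
    if g > 0.75 and p < 0.01:
        return "very confident"
    if g > 0.75 and p < 0.05:
        return "confident"
    if g > 0.5 and p < 0.05:
        return "actionable"
    return "none"


# Example pairs transcribed from Table 2.
print(mutpred_hypothesis(0.874, 2.2e-03))  # T43A, altered transmembrane protein
print(mutpred_hypothesis(0.945, 0.04))     # R58T, altered disordered interface
print(mutpred_hypothesis(0.745, 0.03))     # A188T, loss of relative solvent accessibility
```

Because the comparisons are strict, a P-value of exactly 0.05 falls outside every category in this sketch; whether the server treats boundary values the same way is not stated in the text.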

In Silico Prediction of Deleterious Non-Synonymous SNPs of Human GABRA2 gene and Altered Protein Structure and Function – A Link to Alcohol Dependence?

Abstract

Introduction