Neurodegenerative illnesses are now fairly common in humans. Several factors, including hereditary and environmental influences, contribute to the development of these disorders. A few studies have demonstrated the impact of mutations on gene expression and disease development (Armstrong et al. 2019; Gerstung et al. 2015). Recent research has primarily focused on neurodegenerative disorders such as Alzheimer's and Parkinson's, but no work has been published on the pattern of codon usage bias (CUB) in epilepsy-related genes. CUB is a common phenomenon found in a wide range of species, from prokaryotes to eukaryotes. It is a multifaceted phenomenon shaped mostly by nucleotide composition (Zhang et al. 2011) and natural selection (Yannai et al. 2018). Several other factors also influence CUB, for example environmental factors and tRNA availability (Behura and Severson 2013; Salim and Cavalcanti 2008). The findings of our study shed light on the molecular intricacies of critical genes linked to epilepsy.
Genomic composition modulates the overall pattern of codon and amino acid usage, as recently confirmed in coronavirus genomes (Tort et al. 2020). In mammals, GC content and CpG islands actively regulate the transcription of genes via chromatin conformation (Fenouil et al. 2012). The epilepsy-related genes had an overall GC content of 53%, which might have influenced gene expression, as seen in our CAI analysis (avg. CAI = 0.77). Nucleotide composition analysis at the three codon positions showed the highest use of G at the synonymous position, which is prone to methylation. Existing reports suggest that the nucleotide composition of a gene at the synonymous position follows the overall genomic composition due to positive selection pressure (Kryazhimskiy et al. 2008). Our findings on nucleotide composition agree well with these reports and were further validated by the SCUO vs GC plot (Fig. 2) and the correlation observed between genomic GC and GC3s.
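The position-specific composition measures discussed above can be illustrated with a minimal sketch. It assumes an in-frame coding sequence and takes GC3s to be the GC fraction at the third position of codons that have synonymous alternatives (excluding ATG, TGG, and the three stop codons of the standard genetic code). The function names `gc_fraction` and `gc_profile` are hypothetical and are not taken from the analysis pipeline used in this study.

```python
# Codons without synonymous alternatives (Met, Trp) plus stop codons,
# assuming the standard genetic code.
NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc_fraction(bases):
    """Fraction of G or C among the given bases (0.0 if there are none)."""
    bases = list(bases)
    return sum(b in "GC" for b in bases) / len(bases) if bases else 0.0


def gc_profile(cds):
    """Overall GC, GC at codon positions 1-3, and GC3s for one in-frame CDS."""
    seq = cds.upper()
    if len(seq) % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return {
        "GC": gc_fraction(seq),
        "GC1": gc_fraction(c[0] for c in codons),
        "GC2": gc_fraction(c[1] for c in codons),
        "GC3": gc_fraction(c[2] for c in codons),
        # GC3s ignores codons that cannot vary synonymously.
        "GC3s": gc_fraction(c[2] for c in codons if c not in NON_SYNONYMOUS),
    }


if __name__ == "__main__":
    # Toy sequence for illustration only, not one of the studied genes.
    print(gc_profile("ATGGCAAAG"))
```

Applying such a function to every gene gives the per-gene values that a GC versus GC3s correlation would be computed from.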
RSCU analysis showed that the 11 most used codons (RSCU > 1.6) all end in G or C. These over-represented codons are GTG (1.936), ATC (2.004), TTC (2.196), GAC (2.228), TGC (2.252), AAC (2.288), TAC (2.324), GAG (2.38), AAG (2.428), CAC (2.472), and CAG (2.984). Similarly, Uddin and Chakraborty showed that the codons CTG, CAG, TCC, AGC, CGC, ATC, GCC, GGC, ACC, GTG, and CGG are over-represented in genes related to the central nervous system (Uddin and Chakraborty 2019). These findings are comparable to those of Yang et al. (2010), who studied codon usage bias in genes related to Alzheimer's disease and found that G/C-ending codons were substantially favored and that most optimal codons had either G or C at the third codon position. This suggests that genes linked to neurological disorders have a distinct codon usage pattern. Furthermore, the relationship between AT/GC at synonymous positions and codon usage (Fig. 4) revealed that G/C-ending codons are positively associated with the compositional pattern, whereas A/T-ending codons show the opposite relationship. Our findings therefore show that GC composition has a considerable impact on the CUB of the epilepsy-related genes investigated. The mean ENc observed here was 48.25, indicating moderate CUB. The proopiomelanocortin (ENc = 31.28), ADRA2A (ENc = 32), and OPRD1 (ENc = 32.08) genes showed the highest CUB. A previous study on genes related to the central nervous system showed an average ENc of 40.53 (Uddin and Chakraborty 2019). ENc values for the albumin superfamily were observed to range from 51.56 to 56.62, indicating low codon usage bias (Mirsafian et al. 2014). Here, all but 14 genes showed an ENc value greater than 40. The high ENc values observed in epilepsy-related genes might be related to the availability of tRNAs maintained by cells expressing these genes for error-free replication and final protein synthesis (Jenkins and Holmes 2003; Kanaya et al. 2001; Novoa and de Pouplana 2012).
Furthermore, mRNA stability may be connected to the ENc pattern adopted by epilepsy-associated genes (Burow et al. 2018). The negative relationship between ENc and CAI (r = -0.749, p < 0.01) suggests that codon usage has a significant influence on the expression of epilepsy-linked genes.
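For readers less familiar with RSCU, the following is a minimal sketch of how such values are commonly computed: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. It assumes the standard genetic code and pooled counts across all input sequences; the function name `rscu` and the toy input are hypothetical, and the sketch is not claimed to reproduce the software used in this study.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks stops.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Relative synonymous codon usage pooled over a list of in-frame CDSs."""
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper()
        # Ignore any trailing partial codon and codons with ambiguous bases.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    # Group sense codons into synonymous families by amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        if len(codons) < 2:
            continue  # Met and Trp have no synonymous alternatives
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input
        for c in codons:
            # Observed count over the count expected under equal usage.
            values[c] = counts[c] * len(codons) / total
    return values


if __name__ == "__main__":
    # Toy input: three GAG and one GAA codon (both encode glutamate).
    print(rscu(["GAGGAGGAGGAA"]))
```

Under this definition a value of 1 indicates no bias within a synonymous family, and values above 1 mark codons used more often than expected.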
PR2 plot analysis confirmed the dominance of G and A nucleotides at the wobble position of epilepsy-related genes. Furthermore, Fig. 5 shows that the genes cluster in a specific area of the plot, implying a significant degree of selection pressure on the genes under study. However, a few genes were found in the central area of the PR2 plot, confirming the role of mutation pressure as previously reported (Rahman et al. 2022). The neutrality plot revealed a narrow range of GC3s and a regression slope close to zero (Fig. 6). This indicates that selection pressure accounts for about 62% of the influence on CUB. According to our PR2 plot and neutrality plot analyses, epilepsy-related genes were subject to significant selection pressure and modest mutation pressure. Axis 1 of the correspondence analysis showed the strongest significant correlations with gene expression and codon usage bias. Moreover, the distribution of synonymous codons and genes was found to be in line with axis 2. Taken together, it is conceivable that genes connected to the central nervous system, notably those related to epilepsy, are under strong selection pressure.
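The way a neutrality plot is usually read can be sketched as follows. The sketch assumes the common convention that GC12 (the mean of GC1 and GC2) is regressed on GC3 by ordinary least squares, that the slope estimates the relative contribution of mutation pressure, and that one minus the slope estimates the contribution of selection. The function name `neutrality_slope` and the numbers are hypothetical illustrations, not data from this study.

```python
def neutrality_slope(gc12, gc3):
    """Least-squares slope and intercept of GC12 regressed on GC3."""
    n = len(gc3)
    if n != len(gc12) or n < 2:
        raise ValueError("need two equal-length lists with at least two genes")
    mean_x = sum(gc3) / n
    mean_y = sum(gc12) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3)
    if sxx == 0:
        raise ValueError("GC3 values are all identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc3, gc12))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Made-up per-gene values for illustration only.
    gc3 = [0.40, 0.60, 0.80]
    gc12 = [0.46, 0.50, 0.54]
    slope, intercept = neutrality_slope(gc12, gc3)
    # Under the assumed convention: slope ~ mutation share, 1 - slope ~ selection share.
    print(f"slope={slope:.2f}, mutation={slope:.0%}, selection={1 - slope:.0%}")
```

Under this convention, a selection share of 62% would correspond to a regression slope of about 0.38.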