NiV is an emerging bat-borne pathogen that causes severe respiratory and neurological disease with high mortality. It can spread through the population via infected people or infected animals. Depending on epidemiological distribution, different strains of the virus with differing clinical features have been reported [1]. Knowledge of the factors that influence codon usage bias, and of the intensity of that bias, is important for understanding viral evolution and transmission. Previously, Khandia et al. [3] and Chakraborty et al. [12] studied the NiV codon usage pattern and its influencing factors. In both studies, RSCU values (a major indicator of codon usage bias) and codon usage patterns were evaluated for the complete genome. However, viruses with multiple open reading frames (ORFs) have been reported to show different codon usage patterns for different genes [10, 11].
The NiV genome (single-stranded, negative-sense RNA; 18.2 kb) has six genes/ORFs that encode nine different proteins (N, P, F, G, W, V, C, M, and L) [5]. The N gene encodes the viral N protein, the most abundant of the NiV structural proteins [6], and the relative abundance of N protein is a major controlling factor for genome encapsidation, replicase activity, and regulation of viral RNA synthesis [7]. The current study therefore focused on the codon usage bias of the N gene of NiV, using multiple systematic analytical methods. We calculated and compared RSCU values for each synonymous codon of the N gene of various NiV isolates and of its hosts (human, pig, and bat). Comparison of the preferred codon for each amino acid in the viral N gene and in its hosts showed that 2 preferred codons [Ile (AUC) and Arg (AGA)] were shared between the virus and humans, 5 preferred codons [Ile (AUC), Pro (CCA), Glu (GAA), Arg (AGA), Gly (GGA)] were shared between the virus and pig, and only 1 preferred codon [Ile (AUC)] was shared between the virus and bat. The RSCU comparison between the viral N gene and its hosts indicates the presence of codon usage bias in NiV.
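The RSCU comparison described above can be illustrated with a minimal sketch. It assumes the standard definition of RSCU (observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally) and the standard genetic code. The function name `rscu` and the toy sequence are hypothetical and are not taken from the study's data or software.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in RNA notation, codons ordered U, C, A, G at each position.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(seq):
    """Return {codon: RSCU} for every codon of each amino acid present in seq.

    Met and Trp (single codon) and stop codons are excluded.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA input
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        if len(family) < 2:
            continue  # no synonymous alternatives
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            # observed count / (family total / number of synonymous codons)
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy input (hypothetical, not an NiV sequence): four Ile codons.
toy = rscu("AUCAUCAUCAUU")
preferred_ile = max(("AUU", "AUC", "AUA"), key=toy.get)
```

Under this definition, a value above 1 marks a codon used more often than expected, and the codon with the highest RSCU in each family is the preferred codon that would be compared between virus and host.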
The ENC is a simple indicator of codon bias. ENC values have previously been determined for various RNA viruses, such as Japanese encephalitis virus (mean ENC = 55.30) [9], Zika virus (mean ENC = 52.72) [8], and chikungunya virus (mean ENC = 55.56) [21]. A high ENC value (more than 45) indicates weak codon usage bias [8, 9]. In the current study, the mean ENC value for the N gene of NiV isolates was 50.98, indicating low codon usage bias in NiV. A virus with low codon usage bias can use multiple codons for each amino acid, which allows it to replicate more efficiently in the host cell [22].
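For readers unfamiliar with how an ENC value is obtained, the following is a minimal sketch assuming Wright's original formulation, in which codon homozygosity F is averaged within each degeneracy class and ENC = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6. The function name `enc` is hypothetical, and the handling of sparse data (skipping amino acids with fewer than two codons or non-positive F, and raising an error when a whole class is missing) is a simplifying assumption rather than the procedure used in this study.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in RNA notation, codons ordered U, C, A, G at each position.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def enc(seq):
    """Effective number of codons (Wright's formulation), between 20 and 61."""
    seq = seq.upper().replace("T", "U")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    counts = Counter(codons)

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    # Collect homozygosity values, keyed by degeneracy (2, 3, 4 or 6).
    f_by_class = defaultdict(list)
    for family in families.values():
        k = len(family)
        n = sum(counts[c] for c in family)
        if k == 1 or n < 2:
            continue  # Met/Trp, or too few observations to estimate F
        f = (n * sum((counts[c] / n) ** 2 for c in family) - 1) / (n - 1)
        if f > 0:  # assumption: non-positive estimates are dropped
            f_by_class[k].append(f)

    total = 2.0  # Met and Trp each contribute one codon
    # (degeneracy, number of amino acids in that class)
    for k, n_amino in ((2, 9), (3, 1), (4, 5), (6, 3)):
        if not f_by_class[k]:
            raise ValueError(f"no usable amino acid in the {k}-fold class")
        mean_f = sum(f_by_class[k]) / len(f_by_class[k])
        total += n_amino / mean_f
    return min(total, 61.0)
```

A sequence that uses a single codon for every amino acid gives the minimum value of 20 (extreme bias), while values approaching 61 indicate nearly uniform use of synonymous codons, which is the sense in which a mean of 50.98 reflects low bias.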
In several RNA viruses, mutational pressure and natural selection are the two key forces that determine codon usage bias [21]. If mutational pressure were the only factor determining codon usage bias, all data points in the ENC versus GC3 plot would lie on the expected curve [16]. In this study, the data points fell below the expected curve, indicating that factors other than mutational pressure also influence the codon usage bias of the N gene of NiV. The effect of mutational pressure on codon usage bias was supported by substantial correlations between overall nucleotide contents and A3s, U3s, C3s, and G3s. The significant correlation between ENC values and overall nucleotide contents (except %G) confirmed the involvement of mutational pressure. The values of the first and second axes of the CA were also significantly correlated with overall nucleotide content. Together, these findings indicate that mutational pressure is a significant factor influencing the codon usage bias of the N gene of NiV.
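The reasoning behind the ENC versus GC3 plot can be sketched numerically. The example assumes the commonly used expected curve ENC = 2 + s + 29 / (s² + (1 − s)²), where s is the GC3s fraction; the function names are hypothetical, and the GC3s value in the usage lines is an illustrative placeholder, not a value reported in this study.

```python
def expected_enc(gc3s):
    """Expected ENC when codon usage is shaped only by GC3s composition."""
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def deviation_from_curve(observed_enc, gc3s):
    """Observed minus expected ENC; negative values lie below the curve."""
    return observed_enc - expected_enc(gc3s)


# Mean ENC from the text (50.98) with a hypothetical GC3s of 0.5.
gap = deviation_from_curve(50.98, 0.5)
below_curve = gap < 0
```

A point on the curve (deviation near zero) is consistent with mutational pressure alone, whereas a clearly negative deviation suggests that additional forces, such as natural selection, also constrain codon choice.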
Natural selection may also alter codon usage patterns during the virus's adaptation to host cells [8, 9]. Strong correlations of GRAVY values with GC3s, G3s, C3s, A3s, U3s, and ENC values were observed in the current study, indicating that viral protein characteristics also contribute to the observed variation in NiV codon usage. The high CAI values of the N gene of NiV isolates relative to its hosts (human, pig, and bat) indicated an effect of natural selection on codon usage bias. Moreover, CAI values were higher than the eCAI values for the respective hosts, which also indicated significant adaptation of the virus to its hosts due to natural selection.
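As an illustration of how a CAI value relates a viral gene to a host, the sketch below assumes the usual definition: each codon receives a relative adaptiveness weight (its frequency in a host reference set divided by the frequency of the most used synonymous codon), and CAI is the geometric mean of the weights over the gene. The function names, the `floor` value used for codons absent from the reference, and the toy sequences are all hypothetical; real analyses use host codon usage tables rather than a single reference string.

```python
import math
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in RNA notation, codons ordered U, C, A, G at each position.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(seq):
    seq = seq.upper().replace("T", "U")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_seq, floor=0.01):
    """Weights w = count / max count within each synonymous family.

    Codons unseen in the reference get `floor` (an assumption made here so
    that the geometric mean stays defined).
    """
    counts = Counter(split_codons(reference_seq))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa not in "*MW":  # skip stops and single-codon amino acids
            families[aa].append(codon)
    weights = {}
    for family in families.values():
        best = max(counts[c] for c in family)
        if best == 0:
            continue  # amino acid absent from the reference
        for codon in family:
            weights[codon] = max(counts[codon] / best, floor)
    return weights


def cai(gene_seq, weights):
    """Geometric mean of the weights of the gene's codons."""
    logs = [math.log(weights[c]) for c in split_codons(gene_seq) if c in weights]
    if not logs:
        raise ValueError("no scorable codons in gene_seq")
    return math.exp(sum(logs) / len(logs))


# Toy host reference (hypothetical): Ile encoded by AUC three times, AUU once.
host_weights = relative_adaptiveness("AUCAUCAUCAUU")
well_adapted = cai("AUCAUC", host_weights)
less_adapted = cai("AUCAUU", host_weights)
```

In this toy case the gene that uses only the host's most frequent Ile codon scores higher than the one that mixes in a less frequent codon, which mirrors how higher CAI values are read as better adaptation to the host's codon usage.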
In many RNA viruses, geographical dispersion and evolutionary processes also contribute to codon usage bias [8, 9]. In this study, CA grouped by geographical distribution and phylogenetic analysis were used to investigate the effects of geographical dispersion and evolutionary processes on codon usage, respectively. In the CA, two distinct clusters were formed. All Malaysian and Cambodian isolates fell into Cluster-A, and Indian and Bangladeshi isolates fell into Cluster-B, while isolates from Thailand were distributed across both Cluster-A and Cluster-B. This grouping of area-specific NiV isolates into specific clusters of the CA graph indicated a role of geographical distribution in codon usage bias. In the phylogenetic tree, two clades were observed, and the distribution of NiV isolates was similar to their distribution in the CA graph. The similar patterns of clustering in the CA and of clade formation in the phylogenetic tree supported a role of evolutionary processes in codon usage bias in NiV.
The current study indicated low codon usage bias in NiV. Mutational pressure and natural selection were found to be the two key factors driving codon usage bias. In addition, geographical distribution and evolutionary processes also influenced codon usage bias to some extent.