Rapid evolution, selection pressure, and recombination act as powerful evolutionary forces that increase genetic diversity in norovirus. Owing to their small genomes, high mutation rates, short generation times, and large population sizes, RNA viruses are suitable models for studying evolution within the framework of population genetics. Previous studies have focused only on specific NoV genotypes or on part of the genome from a specific continental region [1, 51]. The current study performed a comprehensive genome-wide analysis of NoV GII isolates from different continental regions to gain a better understanding of their genetic structure, recombination events, and patterns of natural selection.
The genetic structure analyses in the current study identified no geography-based distinction among the NoV GII isolates. Because of high human mobility and travel in the modern world, the NoV GII isolates may have disseminated worldwide, leaving no continent-specific distinction among them. The genetic structure analysis revealed that the genotype GII.4 samples are distinct from the rest of the NoV GII genotypes (Figure 1). The stratification of all NoV GII samples into two main subpopulations was also supported by the branching pattern of the phylogenetic tree and by PCA (Figures 3 and 4). This is somewhat contrary to the findings of Kobayashi et al. (2016), in which the NoV samples were stratified into three main populations based on ORF1, and the genotype GII.4 samples were reported to cluster with the GII.15 and GII.20 genotypes. The complete-genome-based analyses in the current study revealed a clear distinction of GII.4 from the rest of the GII genotypes, including GII.15 and GII.20. Moreover, additional analyses of the GII.4 isolate sequences suggested further clustering at K=2 and K=5 (Figure 2C). At K=5, the GII.4 Sydney_2012 variants stratified into three lineages. Such a stratification pattern of the Sydney_2012 variant has also been reported previously on the basis of ORF2 gene sequences.
The current study identified admixed strains using the admixture and linkage models implemented in the STRUCTURE program. The admixture model does not take into account the physical relationship between loci, so the proportion of admixed strains may sometimes be under- or overestimated. Therefore, to refine the membership scores assigned to the admixed strains, the linkage model, which accounts for potential linkage between loci, was also applied. Admixed isolates were observed in the C-1.1b, C-1.2b, C-1.3b, and C-2.1a clusters. The majority of the admixed and recombinant strains belong to non-GII.4 genotypes. A few of these admixed strains, such as GII.Pb/GII.3, GII.Pb/GII.13, and GII.Pg/GII.12, are reported to be globally prevalent. Recombination among NoV strains occurs at high frequency and acts as a major driving force of viral evolution. Recombination allows the virus to increase its genetic fitness, evolve, and spread in the host population by escaping the host immune response. Admixture in NoV is possibly responsible for the genetic diversification of the C-1.2b and C-1.3b clusters. Likewise, the |D'|, r², and ISA statistics indicated weak linkage among the norovirus GII isolates in the current study, indirectly supporting a role for recombination in shaping the evolution of the NoV GII isolates.
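To make the linkage statistics mentioned above concrete, the following is a minimal sketch of how |D'| and r² can be computed for a single pair of biallelic sites from phased haplotypes, using the standard textbook definitions. It is an illustration only, not the software or pipeline used in this study; the function name `pairwise_ld` and the toy haplotypes are hypothetical.

```python
def pairwise_ld(site_a, site_b):
    """Return (D, |D'|, r2) for two biallelic sites.

    site_a and site_b are equal-length sequences; element i is the allele
    carried by haplotype i at that site. Hypothetical helper for illustration.
    """
    if len(site_a) != len(site_b) or len(site_a) == 0:
        raise ValueError("sites must be non-empty and of equal length")
    alleles_a = sorted(set(site_a))
    alleles_b = sorted(set(site_b))
    if len(alleles_a) != 2 or len(alleles_b) != 2:
        raise ValueError("both sites must be biallelic")

    n = len(site_a)
    a_ref, b_ref = alleles_a[0], alleles_b[0]  # arbitrary reference alleles

    # Allele and haplotype frequencies
    p_a = sum(x == a_ref for x in site_a) / n
    p_b = sum(y == b_ref for y in site_b) / n
    p_ab = sum(x == a_ref and y == b_ref for x, y in zip(site_a, site_b)) / n

    # Coefficient of linkage disequilibrium
    d = p_ab - p_a * p_b

    # Largest |D| attainable given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Toy data: perfectly associated sites versus independent sites
linked = pairwise_ld("AAAATTTT", "CCCCGGGG")
unlinked = pairwise_ld("AATT", "CGCG")
```

In this toy example the first pair is in complete linkage (|D'| and r² both equal 1), while the second pair shows no association (both equal 0). Values near zero across many site pairs are what one would expect when recombination has broken down associations between loci.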
The steady BSP generated from complete-genome markers suggests a predominantly stable effective population size for the NoV GII isolates originating from human hosts (Figure 5). The sharp decrease in the effective population size of NoV in 2003 might have been caused by the introduction of GII.4 as a new variant. In 2002-03, a marked increase in NoV infections was reported in England and other countries due to the emergence of the GII.4 Farmington Hills and b4s6 variants. The BSP also indicated a rapid increase in the effective population size during 2009-10. This might be associated with the large outbreaks and epidemic spread of the GII.4 New Orleans_2009 variant. Likewise, a novel GII.12 strain also emerged during this period and caused several outbreaks. The effective population size fell sharply in 2015, which may correspond to the acquisition of host immunity against the dominant NoV variants.
Substantial signals of episodic diversifying selection were observed across all proteins, both structural and non-structural. However, only limited signals of pervasive positive selection were identified for the NoV GII samples, in the VP1 and VPg genes. Xingguang et al. (2021) previously reported detecting no episodic positive selection signals in genotype GII.2 isolates and proposed genetic drift as a possible mechanism of NoV GII.2 evolution. In the current study, however, significant positive selection signatures were identified for the GII.2 strains (Tables S4 and S5). This suggests that selection pressure is a possible driving force in GII.2 evolution. Several other studies have also reported small numbers of positively selected sites in the VP1 protein of NoV GII isolates [52, 60]. The VP1 protein plays a fundamental role in the interaction of NoVs with the host cell and is considered a key site for immune recognition and receptor binding. Therefore, this protein may be a potential target for vaccine development. We observed several sites under positive selection in the P1 and P2 domains as well as in the shell (S) domain of the VP1 protein. The mutations at positions 282 to 395 of VP1 (Table S5) lie within the P2 domain, a region reported to play an important role in the interaction with human histo-blood group antigens (HBGAs). The S domain is largely conserved across genotypes, and the antigenic sites mapped to this domain are mostly cross-reactive. Besides positive selection, a large number of sites were under negative selection, indicating purifying selection. In general, positively selected sites may reflect immune pressure leading to escape mutations, whereas negatively selected sites may prevent deterioration of antigenic function and structure. The sites under positive selection could provide markers for vaccine design.
The negatively selected sites identified in the NoV GII genes may help to identify highly conserved regions useful for implementing new diagnostic protocols. A marked difference was observed in the pattern of positive selection signatures between the GII.4 samples and the rest of the GII genotypes, which might have shaped the distinct genetic composition of the GII.4 genotype identified in the current analyses.