Phylogenetic characterization and evolution of the VP4 gene of P[9] rotaviruses

Objectives: Rotavirus is one of the major causes of gastroenteritis in children under 5 years of age and is responsible for over 200,000 deaths annually. Rotavirus can evolve by reassortment, in which gene segments are exchanged between strains of different origins. Rotavirus strains with the P[9] genotype are an example of reassortment, in which the P[9] genotype originates from feline species. A number of outbreaks caused by P[9] strains have been documented in several countries. However, the epidemiological relationships between these strains remain largely unknown. Therefore, in the present study, genetic characterization and evolutionary analyses were performed to gain insight into P[9] strains circulating in different parts of the world. Results: The P[9] strains could be divided into five lineages, and the common ancestor of currently circulating P[9] strains is around 168 years old. In each lineage, the strains were not only from different countries but also from different continents. These findings suggest that none of the lineages has a specific region of distribution and that, although humans have had interactions with cats for thousands of years, the ancestor of the current P[9] strains is relatively recent.


Introduction
Rotavirus is a major cause of gastroenteritis and is responsible for over 200,000 deaths annually in children under 5 years of age [1]. Rotavirus has a double-stranded RNA genome divided into 11 segments, encoding six structural (VP1-VP4, VP6, and VP7) and six nonstructural proteins (NSP1-NSP6) [2]. A binary classification system based on the two outer capsid proteins, VP7 and VP4, which define the G and P genotypes, respectively, is used to identify different strains of group A rotavirus. To date, 35 G genotypes and 50 P genotypes have been found [3]. Among the numerous G/P-genotype combinations, only a limited number are commonly found in human infections; these are G1P[8], G2P[4], G3P[8], G4P[8], and G9P[8].
Based on limited studies, the evolutionary patterns of the human P[9] rotaviruses appear to be complex [22]. Therefore, the present study was performed to determine the genetic relationships among the P[9] strains circulating in different countries, and their evolutionary timelines.

Phylogenetic analyses
A total of 88 full- and partial-length VP4 gene nucleotide sequences of P[9] strains were extracted from GenBank (Additional file 1). Phylogenetic analyses were conducted with the neighbour-joining method using MEGA after aligning the nucleotide sequences using CLUSTAL W [24]. The branching patterns were evaluated based on a bootstrap analysis of 1,000 replicates. In all of the phylogenetic trees, lineages were designated based on significant bootstrap values of at least 70%. Several phylogenetic trees were constructed using the full-length gene and partial-length genes of different lengths (Additional file 2).
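The bootstrap evaluation described above can be illustrated with a minimal sketch. It does not build a neighbour-joining tree and is not what MEGA does internally; it only shows the column resampling behind a bootstrap replicate and how a 70% cutoff would be applied to one simple grouping (whether one sequence remains the nearest neighbour of another). The strain names, toy sequences, and helper functions are all hypothetical.

```python
import random

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def bootstrap_columns(alignment, rng):
    """One bootstrap replicate: resample alignment columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

def nearest_neighbour(alignment, query):
    """Name of the sequence with the smallest p-distance to `query` (ties by name)."""
    others = [n for n in alignment if n != query]
    return min(others, key=lambda n: (p_distance(alignment[query], alignment[n]), n))

def bootstrap_support(alignment, query, partner, replicates=1000, seed=0):
    """Percentage of replicates in which `partner` is the nearest neighbour of `query`."""
    rng = random.Random(seed)
    hits = sum(
        nearest_neighbour(bootstrap_columns(alignment, rng), query) == partner
        for _ in range(replicates)
    )
    return 100.0 * hits / replicates

# Hypothetical toy alignment (not real P[9] sequences).
alignment = {
    "strain_A": "ATGGCTTCACTCATGG",
    "strain_B": "ATGGCTTCACTCATGA",
    "strain_C": "TTGACATCGCTAACGG",
    "strain_D": "TTGACATCGCTAACGA",
}

support = bootstrap_support(alignment, "strain_A", "strain_B", replicates=1000)
is_supported = support >= 70.0  # same cutoff as used for lineage designation
print(f"A-B grouping supported: {is_supported}")
```

In a real analysis the grouping tested in each replicate would be a clade of the reconstructed tree rather than a nearest-neighbour relation.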

Nucleotide identity
The nucleotide identities of the full- and partial-length P[9] genes of different lineages were compared with BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi).
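As a rough illustration of what a pairwise nucleotide identity value represents, the sketch below computes percent identity for two sequences that are already aligned. It is not BLAST, which finds its own local alignment before reporting identity; the function name and the toy sequences are assumptions made for the example.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared

# Hypothetical aligned fragments (not real VP4 sequences): 10 of 12 sites match.
ref = "ATGGCTTCACTC"
other = "ATGGCATCGCTC"
identity = percent_identity(ref, other)
print(f"{identity:.1f}% identity")
```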

Timeline and evolution
Evolutionary analysis was performed using the full-length nucleotide sequences of VP4 genes. We inferred a maximum clade credibility phylogenetic tree using the Bayesian Markov Chain Monte Carlo method available in BEAST version 1.6.1 [25]. The nucleotide sequences were analyzed using a relaxed molecular clock (uncorrelated lognormal) and the general time-reversible model with invariant sites (GTR + I). The sequences were run for 20 million generations and sampled every 2,000 steps. The end result was a sample size of 1,000 Bayesian trees, which was then verified for convergence using Tracer version 1.5.
Lineage IV consisted of strains from the USA, Italy, and Russia; all were G3P[9]. Lineage V consisted of strains from Paraguay, Brazil, Italy, and Thailand. All of the strains in lineage V were of the G12P[9] combination.
A total of 11 phylogenetic trees were constructed using different partial-length nucleotide sequences (Additional file 2) of P[9] strains. All of these could be classified into five lineages. The distribution of strains in each lineage was consistent with the phylogenetic tree constructed using full- and partial-length gene sequences.
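The idea behind an uncorrelated lognormal relaxed clock can be sketched as follows: each branch receives its own substitution rate, drawn independently from a lognormal distribution, instead of sharing one rate across the tree. This is only a conceptual illustration, not BEAST's implementation or configuration; the mean rate, the spread `sigma`, and the branch durations are hypothetical values chosen for the example.

```python
import math
import random

def draw_branch_rates(n_branches, mean_rate, sigma, seed=0):
    """Draw independent per-branch rates from a lognormal with real-space mean `mean_rate`."""
    rng = random.Random(seed)
    # For a lognormal, E[rate] = exp(mu + sigma^2 / 2), so solve for mu.
    mu = math.log(mean_rate) - 0.5 * sigma ** 2
    return [rng.lognormvariate(mu, sigma) for _ in range(n_branches)]

# Hypothetical inputs: branch durations in years and a mean rate in substitutions/site/year.
branch_years = [12.0, 30.0, 45.0, 81.0]
rates = draw_branch_rates(len(branch_years), mean_rate=1e-3, sigma=0.5)

# Expected substitutions per site on each branch = rate x duration.
branch_lengths = [r * t for r, t in zip(rates, branch_years)]
for years, rate, length in zip(branch_years, rates, branch_lengths):
    print(f"{years:5.1f} y  rate={rate:.2e}  length={length:.4f} subs/site")
```

Setting `sigma` to 0 makes every branch rate equal to `mean_rate`, which corresponds to a strict clock.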

Nucleotide identity
Comparisons of nucleotide identities of 39 full-length and 88 partial-length nucleotide sequences of the outer capsid protein VP4 gene among the five lineages of rotavirus P[9] are shown in Additional files 4 and 5, respectively. P[9] strains of the same lineage shared close nucleotide identity among themselves. Nucleotide identity was lower between strains of different lineages; lineage V stood out, sharing relatively low nucleotide identity with strains of other lineages.

Timeline of evolution
The phylogenetic tree constructed using the Bayesian method also showed five lineages of P[9] strains (Fig. 3), which corresponded with the lineages determined using the neighbour-joining method.

Discussion
All currently circulating P[9] strains were divided into five lineages, and the strains in each lineage were from multiple countries on different continents. This suggests that strains in a lineage do not belong to a specific geographical area. This might exemplify the role of human migration in the spread of strains to different countries. Humans tend to bring their accompanying animals, which might include domestic cats, during migration, and as human migration is a continuous process, this trend is expected to continue in the future.
The results of this study confirmed that sequences of the VP4 gene as short as 555 nt were adequate for lineage designation. For new strains, it is still recommended, where possible, to use full-length nucleotide sequences for phylogenetic analysis to provide a more robust and comprehensive analysis of the evolution, spread, and genome-wide heterogeneity of a given virus. In the present study, full-length nucleotide sequences were used for the timeline evolutionary analysis because different portions of the gene have different rates of evolution, which might affect the determination of lineage age.
Lineage I contained the AU-1 strain (G3P[9]), which was the first P[9] strain detected in humans [4]. The Brazilian strains were from several outbreaks [20]; these strains also clustered in this lineage and shared high nucleotide identity with AU-1 [26]. All Paraguayan P[9] strains of lineage I were detected in the same year; however, the P[9] genotype was in combination with different G genotypes. By contrast, the Paraguayan strains of lineage V were all G12P[9]. These were considered emerging strains [18], suggesting the possibility of an outbreak. The results of the present study support the finding that Paraguayan strains share 99% nucleotide identity with T152, another G12P[9] rotavirus strain discovered in Thailand [9]. However, the underlying mechanism for the specific combination of the P[9] from lineage V with G12 rather than G3 requires further study.
When full nucleotide sequences were compared, P[9] strains of the same lineage shared close nucleotide identity among themselves. When strains of different lineages were compared, nucleotide identity was lower; lineage V stood out, sharing relatively low nucleotide identity with strains of other lineages. All of the strains in lineage V were G12P[9]; the significance of this genotype combination for nucleotide identity requires further study. Also, eight of 12 strains in lineage V were from South America, which suggests the possibility of a single clone spreading across the continent. Few differences were seen when the identities of partial-length sequences were compared with those of full-length sequences. Such differences in nucleotide identity using partial-length nucleotide sequences might be acceptable when no full-length sequences are available.
The information obtained in this study indicates that the common ancestor of currently circulating P[9] rotavirus strains originated approximately 168 years ago, which might be too recent, as humans have interacted with cats for several thousand years. We postulate that there could have been several rotavirus transmission events from cats to humans; however, older strains might have been wiped out by evolutionary constraints, and the currently circulating strains that evolved from the 168-year-old ancestor could have survived and dispersed to different places with further local evolution. The Bayesian tree showed that the rate of evolution among the strains was significantly higher in the last few decades. This is possibly because of increased human-animal interaction in recent decades. It is possible that the current strains might evolve further and give rise to virulent strains of rotavirus.
Human-animal interaction has increased in recent years for several reasons, such as more humans having pets, a loss of animal habitats because of deforestation, and an increase in large-scale farming and ecotourism. As a result, the potential for zoonotic transmission of viruses has increased, which could lay the foundation for the emergence of reassorted strains. The development of a common P[9] vaccine for humans and cats might help control rotavirus infection by this reassorted strain.
We conclude that the available P[9] strains could be divided into five lineages. VP4 gene sequences as short as 555 nt were adequate for lineage designation. Although humans have had interactions with cats for thousands of years, the common ancestor of the current P[9] strains is relatively recent, around 168 years old. The rate of evolution among the strains was significantly higher in the last few decades. This is possibly because of increased human-animal interaction in recent decades. In support of this, we also found that none of the lineages has a specific region of distribution.

Limitations
Further study is needed, particularly using nucleotide sequences of P[9] strains from cats, to evaluate the timeline of evolution of P[9] strains.

Declarations
Ethics approval and consent to participate
Not applicable

Consent to publish
Not applicable

Availability of data and materials
The datasets used and/or analyzed during the current study are available in GenBank. These are also available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests.

Figure 1
Phylogenetic tree constructed based on the full nucleotide sequences of the outer capsid protein VP4 genes of P[9] rotavirus strains. The numbers adjacent to the nodes represent the bootstrap values; values <70% are not shown. The scale bar shows genetic distance, which is expressed as nucleotide substitutions per site.

Figure 2
Phylogenetic tree constructed based on the partial nucleotide sequences of the outer capsid protein VP4 genes of P[9] rotavirus strains, with a length of 555 nucleotides. The numbers adjacent to the nodes represent the bootstrap values; values <70% are not shown. The scale bar shows genetic distance, which is expressed as nucleotide substitutions per site.