Phylogenetic and evolutionary analyses of the VP4 gene of P[9] rotaviruses

Objectives: Rotavirus is one of the major causes of gastroenteritis in children under 5 years of age and is responsible for over 200,000 deaths annually. Rotavirus can evolve by reassortment, in which gene segments are exchanged between strains of different origins. Rotavirus strains with the P[9] genotype are an example of reassortment, in which the P[9] genotype originates from feline species. A number of outbreaks associated with P[9] strains have been documented in several countries. However, details regarding the epidemiological relationships between the strains remain largely unknown. Therefore, in the present study, genetic characterization and evolutionary analyses were performed to gain insight into P[9] strains circulating in different parts of the world. Results: The VP4 gene of the P[9] strains could be divided into six lineages, and the P[9] strains characterized in this study share a common ancestor that circulated circa 1864. In each lineage, the strains were not only from different countries, but also from different continents. These findings suggest that none of the lineages has a specific region of distribution, and although humans have had interactions with cats for thousands of years, the common ancestor of the VP4 gene of the current P[9] strains is relatively recent.


Introduction
Rotavirus is a major cause of gastroenteritis and is responsible for over 200,000 deaths annually in children under 5 years of age [1]. Rotavirus has a double-stranded RNA genome divided into 11 segments, encoding six structural (VP1-VP4, VP6, and VP7) and six nonstructural proteins (NSP1-NSP6) [2]. A binary classification system based on the two outer capsid proteins, VP7 and VP4, which define the G and P genotypes, respectively, is used to identify different strains of group A rotavirus. To date, 36 G genotypes and 51 P genotypes have been found (https://rega.kuleuven.be/cev/viralmetagenomics/virus-classification/rcwg).
One of the mechanisms by which rotavirus evolves is reassortment, in which gene segments are exchanged between strains of different origin. Among the known reassortant strains, G3P[9] is the least studied but has become prominent; in many reports, the VP7 G3 gene detected in association with P[9] is also feline-like [4]. The common G genotypes in combination with P[9] are G3, G6, G1, and G12 [5][6][7][8][9][10][11]. The incidence of P[9] strains causing infection in humans is relatively low, at about 2.5% worldwide [12]. Despite this low overall incidence, the prevalence of cases with P[9] strains is higher in several countries [13][14][15][16][17][18][19][20][21][22][23]. In Brazil and Ireland, around 10% and 18% of the patients infected with rotavirus, respectively, had P[9] strains [21,24]. A further concern is that P[9] strains might become more virulent over time through multiple reassortments and have the potential to cause outbreaks in other areas.
Based on limited studies, the evolutionary patterns of the human P[9] rotaviruses appear to be complex [23]. Therefore, the present study was performed to determine the genetic relationships among the VP4 genes of P[9] strains circulating in different countries, and their evolutionary timelines.

Phylogenetic analyses
A total of 94 full- and partial-length VP4 gene nucleotide sequences of P[9] strains were extracted from GenBank (Additional file 1). Phylogenetic analyses were conducted with the maximum likelihood method using MEGA X [25] after aligning the nucleotide sequences using CLUSTAL W [26]. The branching patterns were evaluated based on a bootstrap analysis of 1,000 replicates. In all of the phylogenetic trees, lineages were designated based on significant bootstrap values of >70%. Several phylogenetic trees were constructed using the full-length gene and partial-length genes of different lengths (Additional file 2).
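The way a bootstrap value relates to lineage designation can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure implemented in MEGA X: clades are represented as frozensets of strain names, and the function names `bootstrap_support` and `supported_lineages` are hypothetical. Only the >70% threshold is taken from the text.

```python
from collections import Counter


def bootstrap_support(reference_clades, replicate_trees):
    """Return the percentage of replicate trees that contain each reference clade.

    reference_clades: list of frozensets of strain names (hypothetical representation).
    replicate_trees: list of collections of clades, one collection per bootstrap replicate.
    """
    counts = Counter()
    for clades in replicate_trees:
        for clade in reference_clades:
            if clade in clades:
                counts[clade] += 1
    n = len(replicate_trees)
    return {clade: 100.0 * counts[clade] / n for clade in reference_clades}


def supported_lineages(support, threshold=70.0):
    """Keep only clades whose bootstrap support exceeds the threshold (>70% in the text)."""
    return [clade for clade, pct in support.items() if pct > threshold]


if __name__ == "__main__":
    # Toy data: two candidate clades scored against ten replicate trees.
    clade_a = frozenset({"strain1", "strain2"})
    clade_b = frozenset({"strain3", "strain4"})
    replicates = [{clade_a, clade_b}] * 6 + [{clade_a}] * 3 + [set()]
    support = bootstrap_support([clade_a, clade_b], replicates)
    print(supported_lineages(support))
```

In this toy example only the first clade would qualify as a lineage, because the second appears in too few replicates.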

Nucleotide identity
The nucleotide identities of the full- and partial-length P[9] genes of different lineages were calculated using online software (www.bioinformatics.org).
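A pairwise nucleotide identity of this kind can be sketched as below. This is an illustrative calculation only; the exact conventions of the online tool (for example, how gaps are handled) are not stated in the text. Here, positions with a gap in either sequence are skipped, which is an assumption, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two aligned nucleotide sequences.

    Assumes both sequences come from the same alignment (equal length).
    Columns with a gap in either sequence are ignored (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real VP4 sequences.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
```

Applying the function to every pair of sequences within and between lineages would give an identity matrix comparable in form to those summarized in the additional files.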

Timeline and evolution
Evolutionary analysis was performed using the full-length nucleotide sequences of VP4 genes. We inferred a maximum clade credibility phylogenetic tree using the Bayesian Markov Chain Monte Carlo method available in BEAST version 1.6.1 [27]. The nucleotide sequences were analyzed using a relaxed molecular clock (uncorrelated lognormal) and the general time-reversible model (GTR+I+Γ model). The chains were run for 60 million generations and sampled every 3,000 steps. The end result was a sample size of 2,000 Bayesian trees, which was then verified for convergence using Tracer version 1.5.
Paraguayan strains were of genotypes G1P[9], G3P[9], G3G4P[9], and G12P[9]. Strains from other countries were of genotype G3P[9] (Additional file 3). Lineage Ib consisted of strains from Italy, Australia, and Japan. All these strains were of genotype G3P[9]. Lineage Ic consisted of G3P[9] and G6P[9] strains from Russia, Japan, Italy, Korea, Hungary, and Tunisia. Among these, strains from Russia, Korea, Hungary, and Italy were G3P[9], while strains from Tunisia and Japan were G6P[9]. Lineage Id consisted of strains from the USA and Italy; all were G3P[9]. Lineage Ie consisted of two G3P[9] strains from Lebanon. Lineage II consisted of strains from Paraguay, Brazil, Italy, and Thailand. All of the strains in lineage II were of the G12P[9] combination.
A total of 11 phylogenetic trees were constructed using different partial-length nucleotide sequences (Additional file 2) of P[9] strains. Partial-length nucleotide sequences of 837 nt and above consistently generated trees that could be classified into six lineages. The distribution of strains in each lineage was consistent with the phylogenetic tree constructed using full- and partial-length gene sequences.
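The relationship between chain length, sampling interval, and the number of logged states can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical and the 10% burn-in is an assumption, since no burn-in or additional thinning is described in the text. The text reports a final sample of 2,000 trees, so some further subsampling not detailed here may have been applied.

```python
def sampled_states(chain_length, sample_every):
    """Number of states logged when sampling every `sample_every` steps."""
    return chain_length // sample_every


def retained_after_burnin(n_samples, burnin_percent):
    """Samples left after discarding the first `burnin_percent` percent as burn-in."""
    discarded = n_samples * burnin_percent // 100
    return n_samples - discarded


if __name__ == "__main__":
    # Chain length and sampling interval are taken from the text.
    logged = sampled_states(60_000_000, 3_000)
    # The 10% burn-in is an assumed value for illustration only.
    kept = retained_after_burnin(logged, 10)
    print(logged, kept)
```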

Nucleotide identity
Comparisons of nucleotide identities of 43 full-length and 57 partial-length nucleotide sequences of the outer capsid protein VP4 gene among the six lineages of rotavirus P[9] are shown in Additional files 4 and 5, respectively.

Timeline of evolution
The phylogenetic tree constructed using the Bayesian method also showed six lineages of P[9] strains (Fig. 3), which corresponded with the lineages as determined using the maximum likelihood method. All P [9]

Discussion
All currently circulating P[9] strains were divided into six lineages. Except for lineage Ie, the strains in each lineage were from multiple countries on different continents. This suggests that strains in a lineage do not belong to a specific geographical area. This might exemplify the role of human migration in the spread of strains to different countries. Humans tend to bring their accompanying animals, which might include domestic cats, during migration, and as human migration is a continuous process, this trend is expected to continue in the future.
The results of this study confirmed that sequences of the VP4 gene as short as 837 nt were adequate for lineage designation. For new strains, it is still recommended, where possible, to use full-length nucleotide sequences for phylogenetic analysis to provide a more robust and comprehensive analysis of the evolution, spread, and genome-wide heterogeneity of a given virus. In the present study, full-length nucleotide sequences were used for the timeline evolutionary analysis because different portions of the gene have different rates of evolution, which might affect the determination of lineage age.
Lineage Ia contained the AU-1 strain (G3P[9]), which was the first P[9] strain detected in humans [9]. The Brazilian strains were from several outbreaks [21]; these strains also clustered in this lineage and shared high nucleotide identity with AU-1 [28]. All Paraguayan P[9] strains of lineage Ia were detected in the same year; however, the P[9] genotype was in combination with different G genotypes. By contrast, the Paraguayan strains of lineage II were all G12P[9]. These were considered emerging strains [19], suggesting the possibility of an outbreak. The results of the present study support the finding that Paraguayan strains share 99% nucleotide identity with T152, another G12P[9] rotavirus strain discovered in Thailand [10]. However, the underlying mechanism for the specific combination of the P[9] from lineage II with G12 rather than G3 requires further study.
When full-length nucleotide sequences were compared, the P[9] strains of the same lineage shared close nucleotide identity among themselves. When strains of different lineages were compared, a decrease in nucleotide identity was observed; lineage II in particular shared relatively low nucleotide identity with strains of other lineages. All of the strains in lineage II were G12P[9]; the significance of this genotype combination for nucleotide identity requires further study. Also, five of eight strains in lineage II were from South America and formed a cluster, which suggests that strains descended from a single ancestor may be spreading across the continent. Few differences were seen when the identities of partial-length sequences were compared with those of full-length sequences. Such differences in nucleotide identity using partial-length nucleotide sequences might be acceptable when no full-length sequences are available.
The information obtained in this study indicates that the common ancestor of currently circulating P[9] rotavirus strains might be relatively recent. We postulate that there could have been several rotavirus transmission events from cats to humans; however, older strains might have been wiped out by evolutionary constraints, whereas the currently circulating strains, which evolved from a common ancestor circa 1864, could have survived and dispersed to different places with further local evolution. Because of increased human-animal interaction in recent decades, it is possible that the current strains might evolve further and give rise to virulent strains of rotavirus.
Human-animal interaction has increased in recent years for several reasons, such as more humans having pets, a loss of animal habitats because of deforestation, and increases in large-scale farming and ecotourism. As a result, the potential for zoonotic transmission of viruses has increased, which could lay the foundation for the emergence of reassorted strains. The development of a common P[9] vaccine for humans and cats might help control rotavirus infection by this reassorted strain.
We conclude that the VP4 gene of the available P[9] strains could be divided into six lineages. VP4 gene sequences as short as 837 nt were adequate for lineage designation. Although humans have had interactions with cats for thousands of years, the common ancestor of the current P[9] strains is relatively recent. We also found that none of the lineages has a specific region of distribution.

Limitations
Further study is needed, particularly using nucleotide sequences of P[9] strains from cats, to evaluate the timeline of evolution of P[9] strains.
Many of the genes in our analysis were sequenced from strains that had undergone passage in cell culture, which might have had an impact on the analyses.