HMPV prevalence in paediatric hospital admissions. Between January 2007 and December 2016, 9079 individuals below 60 months of age were admitted to the paediatric wards at KCH with severe or very severe pneumonia. Samples were collected from 6756 (74%) individuals and 274 (4.1%) were determined HMPV positive (Table 1). Decreased HMPV prevalence was recorded in more recent years i.e. from 2014 to 2016 (Table 1). Among the 274 HMPV positive cases, 43.1 % (n=118) were in children <6 months of age and 71.5% (n=196) in children <2 year of age. Children 3 years accounted for only 3.6% of the cases (n=11) (Table 2).
HMPV genotype assignment and temporal occurrence of HMPV in Kilifi. Phylogenetic analysis of Kilifi and reference F and G gene sequences showed clustering of Kilifi HMPV viruses into the two known HMPV subtypes, A (106/205, 51.7%) and B (99/205, 48.3%), Figure 1. All the subtype A sequences clustered within genotype A2 and further into sub-lineage A2b (84/106, 79.2%) and the unique sub-lineage A2c (22/106, 20.8%). No A1 or A2a sequences were observed. Within subtype B, 48.5% (48/99) of the sequences clustered within genotype B1 and 51.5% (51/99) of the sequences clustered within genotype B2. During the 10-year period of surveillance described here, HMPV infections occurred in a seasonal pattern, mainly between the month of October of one year and April of the following year, Figure 2, which was used to define an epidemic year, i.e. starting September of one year through August of the following year. Thus, subsequent analyses were done by epidemic year. Multiple genotypes co-circulated within a single epidemic year from 2007-2016, Figure 3b. There was a shift in dominance from A2b (2007 to 2011) to B1 (2012 to 2014) with subsequent rise of the sub-lineage A2c in recent years. Notably, B2 genotype remained persistent but non-dominant. A2b sub- lineage circulated only until February 2012.
HMPV genotype prevalence and dominance in Kilifi was compared to other countries around the world. This was based on analysis of 714 F gene sequences from 18 different countries and 500 G gene sequences from 15 countries (additional figure S1). Similar patterns were observed for Kilifi and global data sets, with predominance of A2b and B2 in 2007-2010 and subsequent rise of B1 and A2c in years that followed, Figure 3. Notably, in both Kilifi and other settings, B2 viruses persisted over the 9 years. A1 and A2a were the least commonly detected genotypes globally and not observed in Kilifi.
Phylogenetic analysis of Kilifi HMPV viruses. Based on maximum likelihood phylogenies of F and G genes, sequences from consecutive epidemics largely clustered together. However, across multiple epidemics sequences from the earlier years clustered together away from those in recent years, Figure 4. Further analysis of G gene sequences aligned temporally, revealed circulation of multiple variants of a single genotype in a single epidemic year, Figure 5. Each variant differed by at least five nucleotides (Figure5) and clustered into separate clades, Figure 4b. In comparison to contemporaneous viruses (714 F gene sequences and 500 G gene sequences) (additional figure S1), viruses from Kilifi and those from other locations were highly similar, with an average nucleotide identity of 97.1% and 95.6% (in F gene), and 90.4% and 85.9% (in G gene) for subgroups A and B, respectively. In phylogenies of F and G, combining contemporaneous and Kilifi sequence data, the Kilifi sequences did not cluster separately from others but interspersed with sequences from Canada, Spain, Malaysia, Japan, USA, India and Croatia (Additional figure S2), and similarly, those viruses collected from earlier years in Kilifi clustered away from those in recent years.
F gene sequence analysis. Due to partial gene sequencing (345bp) of the previously reported F gene Kilifi sequences retrieved from GenBank (41), combined Kilifi sequences were trimmed to 345bp. The F gene sequences were highly conserved both at the nucleotide (nt) and amino acid (aa) level with overall mean sequence identity of 89% (nt) and 97% (aa). Within the individual genotypes, the nucleotide and amino acid sequences were more conserved: there was 100% aa sequence similarity within sub-lineages A2b, A2c and genotype B1, and 99.9% within B2 genotype, and the mean nucleotide identities were 98.2% within B1, 99.2% within sub-lineage A2c, 98.3% within both sub-lineage A2b and B2 genotype. The evolutionary rates for F gene were estimated at 1.220× 10-3 (95% highest posterior density (HPD) = 1.317× 10-3 to 5.518× 10-3 ), 1.857× 10-3 (95% HPD = 1.111× 10-3 to 2.289× 10-3 ), 1.242× 10-3 (95%HPD = 8.303× 10-4 to 1.689× 10-3) and 1.050× 10-3 (95%HPD = 5.937× 10-4 to 1.600× 10-3), nucleotide substitutions/site/year for A2c, A2b, B1 and B2, respectively. Selection pressure analysis of the F gene showed overall mean dN/dS ratio of ω= 0.10, ω= 0.15, ω= 0.01 and ω= 0.12 for A2c, A2b, B1 and B2, respectively (Additional figure S3). Using M2a and M3 models, no codon sites were identified as positively selected among the four types (sub-lineage A2b, sub-lineage A2c, B1 and B2 genotypes). (Additional figure S3).
G gene sequence analysis (Kilifi). G gene sequences were less conserved compared to F gene sequences with overall mean sequence identity of 73% (nt) and 56% (aa). However, within genotypes the sequences were highly conserved: the mean identities were estimated at 97% (nt) and 96% (aa) for sub-lineage A2c, 93% (nt) and 89% (aa) for sub-lineage A2b, 94% (nt) and 90% (aa) for B1, 92%(nt) and 87% for B2 genotypes. The deduced G gene amino acid sequences differed in lengths among genotypes: sub-lineage A2c (220aa), sub-lineage A2b (218 and 229aa), B2 (238aa) and B1 (232 and 242aa) long. The varying protein lengths for sub-lineage A2b and B1 genotype were due to changes in stop codon positions. The evolutionary rates for the G gene were estimated at 3.420× 10-3 (95% HPD interval = 1.317× 10-3 to 5.518× 10-3), 8.022× 10-3 (95% HPD = 5.893× 10-3 to 1.060× 10-2) and 9.570× 10-3 (95% HPD = 5.911× 10-3 to 1.33× 10-2) nucleotide substitutions/site/year for A2c, A2b and B1, respectively. Evolutionary rate estimates for B2 genotypes were not determined due to inadequate temporal signal. Among all the genotypes, sub-lineage A2c exhibited significantly lower substitution rates within the G gene (95% HPD = 1.317× 10-3 to 5.518× 10-3). The estimated times of the most common ancestor of the individual genotypes dated to recent origin in both F and G genes, i.e., 2009 (95% HPD = 2005 -2011) for sub-lineage A2c sequences, 2008 (95% HPD = 2006 -2009) for sub-lineage A2b, 2005 (95% HPD = 2002 -2007) for B1 and 2001(95% HPD = 1997-2005) for B2 sequences. The estimated mean dN/dS ratio (ω) was <1 for A2b (ω= 0.57), A2c (ω= 0.38) and B1(ω= 0.78). For the genotype B2, a higher average dN/dS ratio (ω= 1.21) was determined for G gene, indicating a signal of positive selection (Additional figure S3). Using M2a and M3 models, 3 positively selected sites were identified within A2c (17, 24 and 28) and A2b (87, 123 and 135), 1 site (96) in B1 and 2 sites (12,127) within B2 sequences (Additional figure S3).
Genotype changes and Disease severity. We did not observe a statistically significant association between HMPV genotype and pneumonia severity (p-value=0.264) (Table 3).