The University of Limpopo (MEDUNSA campus) (now called Sefako Makgatho Health Sciences University) Research & Ethics Committee approved the study (MREC/P/237/2014).
The diarrhoeal stool samples were collected as routine diagnostic clinical specimens when parents brought their child to a health facility for clinical management, so written informed consent was not required. As part of the WHO-coordinated rotavirus surveillance network, the archived rotavirus-positive specimens were anonymized and utilized for strain characterization under a Technical Service Agreement and a Materials Transfer Agreement with the WHO AFRO Regional Reference Laboratory based at Sefako Makgatho Health Sciences University. The WHO Research Ethics Review Committee granted an ‘exemption activity’, noting that the procedures involved in the study are part of routine hospital-based rotavirus surveillance.
The stool samples were collected from children presenting with diarrhoea during the 2010-2012 and 2014 rotavirus surveillance periods in six African countries: Ethiopia (ETH), Kenya (KEN), Rwanda (RWA), Tanzania (TZA), Togo (TGO), and Zambia (ZMB). A standardised WHO generic protocol for hospital-based rotavirus surveillance was followed to recruit eligible children and collect the stool samples, as described elsewhere [14,15]. The samples were available at the Diarrhoeal Pathogens Research Unit (DPRU), a WHO Rotavirus Regional Reference Laboratory for rotavirus strain characterization based at Sefako Makgatho Health Sciences University. The samples were stored at -20°C until retrieved for this analysis. Fifteen rotavirus strains previously recorded as G12 by conventional genotyping methods were selected for further analysis in this study. Table 1 lists the characteristics of the 15 selected G12 strains.
Viral dsRNA extraction, VP4 and VP7 genotyping:
Viral dsRNA was extracted using the QIAamp® viral RNA extraction kit (Qiagen, Hilden, Germany) as per the manufacturer’s instructions. The extracted dsRNA was subjected to reverse transcription polymerase chain reaction (RT-PCR) to amplify the VP4 (partial VP8*) and VP7 genes using the consensus primer sets Con2/Con3 and sBeg/End9, respectively [15,22,27]. To confirm the samples as G12 rotavirus strains, they were then genotyped using a cocktail of primers consisting of RVG9 and aBT1, aCT2, mG3, aDT4, aAT8v, mG9, mG10 and newG12, representing the G1, G2, G3, G4, G8, G9, G10 and G12 genotypes [37,38]. The VP4 primer cocktail, which amplifies VP8*, consisted of Con3 and 1T-1D, 2T-1, 3T-1, 4T-1 and 4943, representing human rotavirus genotypes P, P, P, P and P [27,37]. The sequences of the primers used in this study are shown in Supplementary Table 1. The PCR conditions were as described elsewhere [22,37,41,42].
Amplicons were sequenced by the dideoxynucleotide chain-termination (Sanger) method on an ABI 3500XL sequencer. A region of VP7 and of VP4 was sequenced using the forward and reverse primers used for RT-PCR. The sequence chromatograms were edited using ChromasPro version 1.49 beta (www.technelysium.com.au/chromas.html), resulting in fragments of 981 bp located at positions 1-981 of the VP7 gene and approximately 731 bp located at positions 97-827 of the VP4 gene.
The sequence data were then compared with available rotavirus sequences in the GenBank database using the NCBI-BLAST software (www.ncbi.nlm.nih.gov/BLAST/, USA). The VP7 and VP4 alignments were made using the MUSCLE algorithm implemented in MEGA 6 software [43,44]. To expand the analysis, VP7 G12 and VP4 P sequences from other African countries available in GenBank were downloaded and included in the alignments. Once the sequences were aligned, the DNA Model Test program implemented in MEGA version 6 was used to identify the evolutionary models that best fit the sequence datasets. Based on the corrected Akaike Information Criterion (AICc), the following models were selected: GTR+G (VP7), T92+G (VP4 P) and GTR+G+I (VP4 P). Using these models, maximum-likelihood trees were constructed in MEGA 6 with 1000 bootstrap replicates to estimate branch support. Nucleotide and amino acid sequence identities among strains were calculated for each gene based on distance matrices prepared using the p-distance algorithm in MEGA 6 software. Dot conservation plots were constructed using the BioEdit sequence alignment editor to identify the variable and antigenic regions within the VP7 gene [46,47] of the study strains relative to G12 reference strains belonging to the four G12 lineages (I-IV).
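As an illustration of the identity calculation described above, the following is a minimal Python sketch of a p-distance between two aligned sequences, converted to percent identity. It is not the MEGA 6 implementation; the function names `p_distance` and `percent_identity`, the set of characters treated as gaps, and the use of pairwise deletion of gapped sites are all assumptions made for this example.

```python
def p_distance(seq_a: str, seq_b: str, gap_chars: str = "-N?") -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumption for this sketch: sites where either sequence has a gap or
    ambiguous character are skipped (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0      # sites where both sequences carry a usable character
    differences = 0   # compared sites where the characters differ
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a != b:
            differences += 1

    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Sequence identity as a percentage, derived from the p-distance."""
    return 100.0 * (1.0 - p_distance(seq_a, seq_b))
```

Applying `percent_identity` to every pair of rows in an alignment gives an identity matrix of the kind used to compare strains; the same function works for nucleotide or amino acid alignments, provided the gap characters are adjusted accordingly.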
In parallel, P VP4 sequences of the study strains were compared with P of both the Rotarix and RotaTeq vaccine strains and of other recently circulating strains, while the P VP4 sequences were analysed by comparison with other globally circulating P reference strains.
To estimate the rate of evolution (substitutions per site per year) and the time of the most recent common ancestor of the G12 genotype, 114 G12 VP7 sequences isolated between 1987 and 2019, representing the temporal span of this genotype from the first G12 isolate to contemporary strains and its global distribution, were retrieved from GenBank and analysed together with our study strains. To investigate the temporal signal of the sequences and to remove sequences that might be diverse, the G12 maximum-likelihood phylogenetic tree was analysed in TempEst v1.5.3, a tool that assesses the association between root-to-tip divergence and the sampling date of each sequence. Finally, Bayesian Markov chain Monte Carlo (MCMC) analysis was performed in the BEAST v.1.6 software package (http://beast.bio.ed.ac.uk). Several models with different priors were initially tested and compared using Bayes factors. The following Bayesian parameters were then set: GTR+G substitution model, uncorrelated exponential relaxed clock lognormal model and coalescent Bayesian skyline tree prior. This analysis was run four times for 50 million generations each. The individual runs were combined with LogCombiner, and Tracer v.1 (http://tree.bio.ed.ac.uk/software/tracer/) was used to view the results; effective sample size (ESS) values of >200 indicated sufficient sampling. Maximum clade credibility trees were annotated using TreeAnnotator v.1.6.2 and visualized in FigTree v.1.4.3 (http://tree.bio.ed.ac.uk/software/figtree/).
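The root-to-tip check described above can be sketched as an ordinary least-squares regression of each tip's divergence from the root on its sampling date. The Python below is a hedged illustration only and does not reproduce TempEst; the function name `root_to_tip_regression` and the returned keys are hypothetical, and a real analysis would also consider the best-fitting root position, which this sketch ignores.

```python
def root_to_tip_regression(dates, divergences):
    """Least-squares fit of root-to-tip divergence against sampling date.

    dates:       sampling dates as decimal years, one per tip
    divergences: root-to-tip distances (substitutions per site), same order

    Under a roughly clock-like process, the slope approximates the
    substitution rate, the x-intercept approximates the age of the root,
    and R^2 indicates the strength of the temporal signal.
    """
    n = len(dates)
    if n != len(divergences) or n < 3:
        raise ValueError("need at least three paired observations")

    mean_x = sum(dates) / n
    mean_y = sum(divergences) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in divergences)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, divergences))
    if sxx == 0:
        raise ValueError("all sampling dates are identical")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    # Residuals help flag tips that sit far from the regression line.
    residuals = [y - (intercept + slope * x) for x, y in zip(dates, divergences)]

    return {
        "rate": slope,
        "root_date": -intercept / slope if slope != 0 else None,
        "r_squared": r_squared,
        "residuals": residuals,
    }
```

Tips with unusually large absolute residuals are candidates for closer inspection before the Bayesian analysis, and a positive slope with a reasonable R² suggests the dataset carries enough temporal signal for molecular clock estimation.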
The partial VP7 and VP4 sequences have been made available in the NCBI GenBank database (accession numbers: MK059426-MK059453; MT995937-MT995938).