Ethical approval for the study was obtained from the University of Pretoria, Faculty of Health Sciences Research Ethics Committee (Ethics Reference No. 406-2014). Patient information was extracted from the Limpopo Malaria Case (LMC) database based on the clinics from which rapid diagnostic test (RDT) samples were collected, after receiving ethical approval from the Limpopo Department of Health (Ref: LP_201906_011).
Study site & study design
Study samples were randomly collected from seven primary health clinics within five known source health districts (Nzhelele/Tshipise, Thohoyandou, Mutale, Elim, and Levubu/Shingwedzi) within the Vhembe District, Limpopo Province, South Africa (Fig. 1). Out of a total of 1892 confirmed cases that occurred during the entire study period and reported only at the 7 study clinics, a total of 1153 (61%) RDT samples from symptomatic patients that tested positive for malaria on P. falciparum-specific RDTs (First Response® Malaria Antigen P. falciparum card test HRP2, Premier Medical Corporation, India) were randomly collected from the seven different clinics in the wet and dry seasons from January to December of the years 2016, 2017 and 2018. Samples were transported and stored at room temperature in sealed bags with desiccant. Demographic and travel history information on each of the individual patients was obtained from the LMC database. After linking genotyped data with patient information from the LCM database, samples were stratified based on the source of infection health district (Fig. 1) as it was established that patients that acquired their infections from similar source areas did not necessarily visit the same clinics for testing and treatment. Therefore, it was concluded that source health districts would be more informative for geospatial analysis than clinics. Forty five percent (45%) of the samples however, could not be linked to source health districts and were therefore classified as "unknown". Source of infection could not be established not because information was not available on the LCM database, but because the genotyped samples could not be linked back to patient identity as the only unique identifiers that could link the patient RDT to their information on the database were either missing or illegible on the RDT. The study was started off with blinded patient identities with only knowledge of year of infection and clinic visited by patients from where samples were then collected.
DNA isolation and quantification
DNA was extracted from the nitrocellulose strip of First Response™ RDTs in accordance with the Worldwide Antimalarial Resistance Network Guidelines . The Saponin-Chelex method  was used for all DNA extractions and extracted genomic DNA stored at -20°C. Isolates were screened for P. falciparum using ultra-sensitive varATS  and TARE-2 [29, 30] quantitative PCR (qPCR) as described before. Parasite density was quantified on a random subset of 353 high volume samples from all years. It was not possible to quantify all samples due to a low DNA volume for a proportion of samples. The genotyping threshold was set at a parasite concentration of ≥ 10 parasites/μL of blood.
A total of 1153 samples were genotyped using a previously described 26 microsatellite marker panel protocol [25, 26, 31]. Briefly, the 26 microsatellite loci were amplified using a semi-nested PCR protocol. The primary PCR was performed in 4 groups of multiplex reactions, and 1 µL of the primary amplified product used as a template for the secondary individual PCR for each marker. To determine repeat length sizes, the labelled PCR products were diluted and sized by denaturing capillary electrophoresis on an ABI 3730XL analyser using GeneScan™ 400HD ROX™ Size Standard (Thermo Fisher Scientific). MicroSPAT software (https://github.com/Greenhouse-Lab/MicroSPAT/releases/tag/v2.0.3) was then used to automate identification of true alleles and differentiate real peaks from artefacts of the resulting electropherograms using a classifier algorithm based on the location and size of locus-specific patterns relative to a primary peak as done in studies conducted in Eswatini , Namibia , the KZN Province of South Africa  and China  that used similar experimental conditions as those used in his study. Multiple alleles per locus were scored if minor peaks were at least a third of the height of the major peak. Collated genotyping data from all samples was processed with similar microSPAT software settings in which a semi-supervised naïve Bayes classifier was used  to avoid variability in allele calling. All samples that met the genotyping threshold were genotyped and had to have a genotyping coverage ≥ 60% (alleles detected on at least 15 or more loci) to be included in the downstream population genetics analysis.
Characterisation of within-host genetic diversity
The within-host diversity of infections was determined using multiplicity of infection (MOI, or genetically distinct P. falciparum clones) and the within-host fixation index (FWS). To reduce the probability false positive alleles influencing MOI, the MOI in an individual sample was defined as the second highest number of alleles detected at any of the 26 microsatellite loci genotyped. Since the P. falciparum parasite is haploid in the human host, multiple peaks or alleles correspond to an infection with multiple genotypes or strains (a polygenomic or polyclonal infection).
The other measure of the within-host diversity, the FWS index, is a measure of diversity in an individual infection relative to the population level genetic diversity and was determined as previously described [25, 26, 32]. A low FWS indicates high within-host diversity relative to the population thereby suggesting higher chances of outbreeding. FWS was calculated for each sample using the equation: , where Hw is the allele frequency of each unique allele found at a particular locus for each individual and Hs is the heterozygosity of the local parasite population. Outbreeding is reported as 1-FWS with a 0 value indicative of a perfect clone and therefore low within-host diversity. Previously described thresholds of Fws ≥ 0.95 (1-Fws ≤ 0.05) were used to identify samples containing a single genotype (or "clonal" infections) in spite of additional genotypes that may be present at relatively low proportions; and Fws ≤ 0.70 (1-Fws ≥ 0.30) to describe samples with highly diverse infections respectively [25, 32-34].
Characterization of population level genetic diversity
Population level genetic diversity was estimated using expected heterozygosity (He) which is defined as the probability of randomly drawing a pair of different alleles from the allele pool. Heterozygosity was, therefore, calculated on each locus using the equation:
where n is the number of genotyped samples and pi is the frequency of the ith allele in the population [25, 26]. Values for He range from 0 indicating no diversity, to 1 indicating that all alleles are different and therefore there is maximum diversity. The mean He was calculated by taking the average He across all loci. The number of haplotypes (unique multilocus genotypes) as well as the unique alleles detected per locus (A - allelic richness), were also determined.
To assess whether alleles from different loci were associated with each other, the multilocus linkage disequilibrium (LD) was determined as previously described  using the Poppr package in R software . LD was determined for the whole dataset which includes monoclonal and polyclonal infections, with LD for monoclonal infections alone determined as a precaution against the bias that may result from presence of any false dominant haplotypes . The Monte Carlo method was used to test the significance of LD in the complete data set of each population stratified by geographic origin of the infection (source health district). In this study, 10 000 permutations were completed. LD values range from 0 (no loci in LD) to 1 (all loci in LD). Pairwise standardized index of association (ISA) over all loci was assessed to determine whether the observed pattern of LD is due to a single or multiple pairs of loci.
Geospatial population substructure and genetic differentiation
To determine the influence of geographic origin of infections on genetic diversity, the ANOVA pairwise t-test was used to compare MOI, 1-FWS and He between the parasite populations stratified by source health district. Population substructure between the geographic areas was investigated by measuring Wright’s F-statistics (FST), using the adegenet package  in R. Hendrick's GST and Jost's D, were calculated using the mmod package  in R. Genetic differentiation between populations ranges from 0 to 1 representing absence of to complete differentiation, respectively. The Monte Carlo method was used to test the significance of pairwise FST between source health districts. In this study, 999 permutations were completed. Additionally, discriminant analysis of principal components (DAPC) using the adegenet package in R software was used to confirm signatures of population structure [37, 39]. DAPC infers population structure based on whether haplotypes (estimated from multi-locus genotypes generated from all major and minor allele data) clustered into distinct genetic populations. Unlike the traditional principal component analysis (PCA) which identifies linear axes that explain the most variability in all groups together, DAPC seeks to detect the linear axes which explain the most between-group variability in data . K-means clustering was used to detect the number of inferred genetic clusters in the parasite population, and the best number of clusters chosen was that with the lowest associated Bayesian Information Criterion (BIC). To prevent overfitting of clusters, the optimal number of principal components (PC) to be retained was confirmed by cross validation of the DAPC. Data was divided into a training set (90% of data), and a validation set (10% of data), and members of each of the identified clusters were stratified by random sampling to ensure that at least one member of each cluster is represented in both training and validation sets. DAPC was then performed on the training set with variable numbers of PCs retained. The extent to which the analysis was able to accurately predict group memberships of individuals in the validation set was used to identify the optimal number of PCs to be retained. Sampling and DAPC procedures were repeated 1000 times at each level of PC retention, and the optimal number of PCs retained was associated with the lowest root mean square error. The resultant clusters were then plotted in a scatterplot of the first and second linear discriminants of DAPC.
Characterising pairwise genetic relatedness between infections
To determine the genetic connectivity/relatedness of pairs of infections including all alleles detected in both monoclonal and polyclonal infections of successfully genotyped samples, a modified identity by state (IBS) metric was used . Overall, pairwise IBS was calculated as the average of locus specific estimates under the assumption of independent loci. This metric was calculated using the formula:
where n is the number of genotyped loci; Si is the total number of shared alleles at locus i between samples X; and YiXi is the number of alleles in sample X at locus I; and Yi is the number of alleles in sample Y at locus i. A total of 85078 pairs of infection within the Vhembe District dataset were analysed and highly related infection pairs above the cut-off of IBS ≥ 0.5  identified. Pairwise comparisons of relatedness of parasite pairs were then grouped into two categories depending on whether they occurred between two parasites isolated from individuals who acquired infections in the same (within) source health district or between two parasites from individuals who acquired infections from different (between) source health districts respectively. ANOVA pairwise t-test was used to compare differences in the proportions of highly related infections within and between the groups.
Influence of temporal variation on genetic diversity
To determine the influence of temporal changes in transmission on genetic diversity, the ANOVA pairwise t-test was used to compare MOI, 1-FWS and He between the parasite populations stratified by year of transmission. Finer scale stratification by month of infection was also assessed to determine how MOI changes in the dry and wet transmission seasons and pairwise t-tests were used to compare MOI between the different seasons.
Assessing the impact of control interventions on the complexity of infections
The impact of the main strategy for control, indoor residual spraying (IRS) on the within-host diversity of P. falciparum was also assessed based on the number of unique genotypes (MOI) identified in sprayed vs. unsprayed households. IRS in the Limpopo Province typically takes place at the beginning of each malaria season (wet season between September and May), with the number of households to be sprayed determined by the provincial malaria control programme based on factors such as the number of structures within the malaria endemic area, insecticide availability and resistance data .