MIP analyses of 2013-2014 samples
After MIPWrangler processing, a 250 bp paired-end MiSeq run on a single
MIP capture yielded 9 million paired-end reads and 4 million UMIs. Sequencing was
successful for 514/552 children. The geolocation data indicate that these 514 children
live throughout the DRC (Figure 1). Complete pfcrt SNP data were available for 513 children, and 307 had data available across all pfcrt and pfdhps loci of interest.
THE REAL McCOIL analysis estimated an average complexity of infection
(COI) of 2 (range = 1-17). Of the children with complete genotyping data, 108
(35%) had polyclonal infections, compared with 20% of infections
in 2007 (X2 = 7.28, df = 1, p < 0.01). However, this is likely an underestimate of the true number
of polyclonal infections because only three loci were examined.
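The between-year comparisons in this section are reported as chi-squared statistics with one degree of freedom. The following is a minimal sketch of how such a comparison of two proportions can be computed, assuming a Pearson test on a 2x2 table without continuity correction; the text does not state which variant was used. The function name is illustrative, and the 2007 counts in the example are hypothetical placeholders, since only the 2007 percentage is reported.

```python
import math


def chi2_two_proportions(k1, n1, k2, n2):
    """Pearson chi-squared test (df = 1, no continuity correction).

    k1 of n1 and k2 of n2 are the counts with the trait in each group.
    Returns (statistic, p_value).
    """
    observed = [[k1, n1 - k1], [k2, n2 - k2]]
    row_totals = [n1, n2]
    col_totals = [k1 + k2, (n1 - k1) + (n2 - k2)]
    total = n1 + n2

    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed[i][j] - expected) ** 2 / expected

    # For df = 1 the chi-squared survival function reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# 108 of 307 polyclonal infections is taken from the text (2013-2014).
# The 2007 counts below (30 of 150, i.e. 20%) are HYPOTHETICAL placeholders.
stat, p = chi2_two_proportions(108, 307, 30, 150)
print(f"X2 = {stat:.2f}, df = 1, p = {p:.4f}")
```

Because the 2007 denominator is assumed, the printed statistic is not expected to reproduce the reported value of 7.28.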
Frequency of pfdhps and pfcrt variants over time
The overall proportion of pfdhps mutations remained relatively steady from 2007 to 2013 (80% [95% CI = 72-86%] vs
86% [95% CI = 83-89%], Figure 2). However, the proportion of K540E mutations increased significantly, from 17%
(95% CI = 11-24%) in 2007 to 41% (95% CI = 36-47%) in 2013 (X2 = 25.57, df = 1, p < 0.01). A581G mutations also increased significantly between years,
from 3% (95% CI = 1-8%) in 2007 to 18% (95% CI = 14-23%) in 2013 (X2 = 15.27, df = 1, p < 0.01). Only one individual in 2007 had a single A581G mutation;
in all other cases, in both years, A581G was only found in the presence of a K540E
mutation. Thus, the proportion of double K540E/A581G mutants also increased significantly
across years, from 2% (95% CI = 1-7%) in 2007 to 18% (95% CI = 14-23%) in 2013 (X2 = 19.27, df = 1, p <0.001).
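Each proportion above is reported with a 95% confidence interval. The following is a minimal sketch of one common way to obtain such an interval, the Wilson score interval. This choice is an assumption, as the text does not say which interval method was used, and the function name is illustrative.

```python
import math


def wilson_interval(k, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion k / n.

    z defaults to the standard normal quantile for a two-sided 95% interval.
    Returns (lower, upper) as proportions between 0 and 1.
    """
    p_hat = k / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    return centre - half_width, centre + half_width


# Example using counts stated in the text: 108 polyclonal infections among
# 307 children with complete genotyping data.
low, high = wilson_interval(108, 307)
print(f"35% (95% CI = {low:.0%}-{high:.0%})")
```

The Wilson interval stays within 0-100% and behaves well for proportions near zero, such as the 2-3% A581G frequencies in 2007, where a simple normal approximation can produce a negative lower bound.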
Amongst monoclonal infections, there were similar patterns of allele frequencies over
time. The proportion of infections carrying any of the three pfdhps SNPs increased slightly, from 62% (95% CI = 51-73%) in 2007 to 73% (95% CI = 66-79%)
in 2013 (X2 = 2.71, df = 1, p = 0.10). The proportion of double K540E and A581G mutant
parasites also increased, from 4% (95% CI = 1-8%) in 2007 to 12% (95% CI = 7-17%) in 2013
(X2 = 3.03, df = 1, p = 0.08), although neither change was significant at the 5% level.
The proportion of pfcrt CVIET haplotypes did not change significantly from 2007 (58% [95% CI = 50-65%]) to
2013 (54% [95% CI = 49-58%]; X2 = 0.80, df = 1, p = 0.37). No parasites harbored the SVMNT haplotype. Among monoclonal
infections, the proportion of pfcrt CVIET haplotypes also remained steady: 55% (95% CI = 46-63%) in 2007 and 56% (95%
CI = 51-61%) in 2013 (X2 = 0.012, df = 1, p = 0.91).
Risk factor analysis
Complete pfdhps and DHS covariate data were available for 492 individuals from both the 2007 and
2013-2014 studies; complete pfcrt and DHS covariate data were available for 675 individuals. Reported antimalarial use
was low, with a cluster average of only 12% of pregnant women receiving SP in 2007
and 24% in 2013. In 2007, an average of only 4% of children per cluster reporting
a cough or fever received amodiaquine, and only about 1% did so in 2013. A summary of the
cluster and individual level characteristics by pfdhps and pfcrt genotype is available in Table 1 (Supplemental Files).
The mixed-effects model identified several risk factors for pfdhps mutations and the pfcrt CVIET haplotype (Table 2 in Supplemental Files). Increasing cluster-level use of SP was a risk factor for carrying a K540E mutation
(PR = 1.14, 95% CI = 1.09 – 1.20, p <0.01) as was increasing cluster prevalence of
P. falciparum infections (PR = 1.11, 95% CI = 1.06 – 1.17, p = 0.02). The results from the pfcrt model indicated an inverse relationship between the prevalence of mutations and the
proportion of uneducated individuals (PR = 0.92, 95% CI = 0.90 – 0.95, p < 0.01).
Education may be a proxy for access to medications.
Results from the secondary univariate models matched those from the multivariate models
(Supplemental Table 2). Like the multivariate model, the univariate models did not identify any risk factors
for carrying any pfdhps mutation. The univariate models of K540E identified both increasing SP use and increasing
cluster P. falciparum prevalence as risk factors, though the p-value for prevalence was not significant
at the 5% level. Like the multivariate model, the univariate models of pfcrt identified only increasing cluster-level education as a risk factor for the CVIET
haplotype. Similarly, an increasing cluster-level proportion of poor individuals showed
a protective effect against the CVIET haplotype, though the association had a p-value
that was not significant at the 5% level. Full results for the univariate models are
available in Supplemental Table 1.
Spatial-temporal prediction maps
The prediction maps generated from the logistic Gaussian model indicate that the allele
frequency distribution of the A437G mutation shifted range slightly between 2007 and
2013, decreasing in the east and west of the country but increasing in the south (Figure 3). The results also demonstrate the geographic spread of both the K540E and A581G
mutations from east to west, showing both an increase in the frequency of each mutation
and a geographic expansion, indicated by the shift in the 10% contour lines (marked
in black). The pfcrt results show no significant change in the spatial distribution
of the CVIET haplotype between 2007 and 2013; the prevalence of the haplotype is highest
across the central part of DRC. The wide 95% credible intervals on posterior parameter
weights indicate that there is large uncertainty as to which components are driving
the signal (Supplementary Figure 1). Similarly, the posterior error maps show that there is large uncertainty in the
predicted allele frequency at most points in space (Supplementary Figure 2). Hence, it is important to recognize that the maps in Figure 3 show only the average prediction, and there are alternative maps that are plausible
under the posterior distribution. However, the general patterns described above, such
as the east-west expansion of K540E and A581G mutations, remain consistent over the
majority of posterior draws, and therefore are well-supported in spite of uncertainty
in any specific prediction.
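The statement that a pattern holds over the majority of posterior draws can be made concrete by computing the fraction of draws in which it holds. The following is a minimal sketch under stated assumptions: the function name, the dictionary keys, and the example values are all hypothetical and are not output from the model described here.

```python
def fraction_of_draws(draws, pattern_holds):
    """Return the fraction of posterior draws for which pattern_holds(draw) is true."""
    draws = list(draws)
    if not draws:
        raise ValueError("at least one posterior draw is required")
    return sum(1 for draw in draws if pattern_holds(draw)) / len(draws)


# HYPOTHETICAL draws: predicted K540E frequency at one western location in
# each survey year. Real draws would come from the fitted spatial model.
example_draws = [
    {"west_2007": 0.04, "west_2013": 0.15},
    {"west_2007": 0.06, "west_2013": 0.11},
    {"west_2007": 0.09, "west_2013": 0.08},
]

# Pattern of interest: the predicted frequency is higher in 2013 than in 2007.
support = fraction_of_draws(example_draws, lambda d: d["west_2013"] > d["west_2007"])
print(f"Pattern holds in {support:.0%} of draws")
```

A summary of this kind describes support for a qualitative pattern and says nothing about the precision of the predicted frequency at any single location.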