Genomic malaria surveillance of antenatal care users detects reduced transmission following elimination interventions in Mozambique

Routine sampling of pregnant women at first antenatal care (ANC) visits could make Plasmodium falciparum genomic surveillance more cost-efficient and convenient in sub-Saharan Africa. We compared the genetic structure of parasite populations sampled from 289 first ANC attendees and 93 children from the community in Mozambique between 2015 and 2019. Samples were amplicon sequenced targeting 165 microhaplotypes and 15 drug resistance genes. Metrics of genetic diversity and relatedness, as well as the prevalence of drug resistance markers, were consistent between the two populations. In an area targeted for elimination, intra-host genetic diversity declined in both populations (p=0.002–0.007), while for the ANC population, population genetic diversity was also lower (p=0.0004), and genetic relatedness between infections was higher (p=0.002) than in control areas, indicating a recent reduction in the parasite population size. These results highlight the added value of genomic surveillance at ANC clinics to inform about changes in transmission beyond epidemiological data.


Introduction
[2][3] Examples include more accurate data on emergence and spread of drug and diagnostic resistance, 4,5 inferring parasite connectivity to support the classification of imported cases, 6 and predicting vaccine effectiveness. 7 [10][11][12] This could be especially useful for stratification and evaluating the effectiveness of anti-malarial interventions.
For continuous genomic surveillance of malaria, samples must be collected regularly and, especially critical for low-resource settings, cost-efficiently. 2,13,14 Pregnant women attending their first antenatal care (ANC) consultation are an easy-access subpopulation that could potentially serve as a sentinel group for malaria surveillance. 13,15,16 Besides low cost and easy accessibility, advantages of ANC-based surveillance include temporal continuity, known denominator populations, and the possibility of capturing asymptomatic infections. 15 Malaria burden trends in pregnant women at their first ANC visit have been shown to mirror community trends, 17 and routine malaria testing at ANC has already been implemented in Tanzania, where it is generally perceived as acceptable and positive by both patients and providers. 18 A few small studies, mostly outside of Africa, have investigated malaria genetic diversity in pregnant women using whole genome sequencing, 19 microsatellite markers, 20,21 or nested polymerase chain reaction (PCR). 22,23 However, routinely collected genomic data from ANC have not been evaluated for their suitability for sentinel surveillance.
We hypothesized that the Plasmodium (P.) falciparum parasite populations circulating in pregnant women at their first ANC visit and in the community are genetically similar, including similar genetic diversity (intra-host and population-level), relatedness between infections, and prevalence of antimalarial resistance markers. To test our hypothesis, we analyzed the parasite population in ANC users in southern Mozambique, and compared it to parasites found in children aged 2-10 years sampled in household surveys. Furthermore, we compared the parasite populations in three areas with declining transmission between 2015 and 2018. Manhiça and Magude were low-transmission areas, with Magude recently targeted for elimination with a package of interventions, 24 while Ilha Josina is a historically high-transmission setting. 17

Sequencing performance
Sequencing was attempted for a total of 558 P. falciparum-positive dried blood spot (DBS) samples from ANC users (n=378) and children sampled in population-representative household surveys (n=180). In total, 241 amplicons were targeted, including 165 microhaplotypes informative about genetic diversity in the parasite population. Of the samples, 68.5% (382/558) were successfully sequenced and passed the filtering criteria (Table 1). Sequencing performance, i.e., total number of reads and number of loci covered per sample (n=558), was primarily a function of parasite density (Fig. 1a,b). Across all samples for which sequencing was attempted, parasite densities were lower in those from children than in those from ANC, and a lower proportion of samples from children passed filtering (51.7% compared to 76.5% from ANC).
Parasite densities were similar between populations among successfully sequenced samples (geometric mean [GM]=191 and GM=154 parasites/µL, respectively), and among samples that were filtered out (GM=7 parasites/µL for both, Fig. 1c). Sequencing coverage was high across included samples (n=382), with a geometric mean total reads per sample of 453,541 and a median of 208 loci covered (out of 241 in total) per sample. On average, each locus (n=241) was covered by 1.4 million reads and 462 samples (Fig. 1d-e).

Intra-host genetic diversity
Half of the pregnant women attending ANC consultations carried polyclonal infections (Table 2). On average, ANC attendees had a multiplicity of infection (MOI) of 2.4, i.e., carried 2.4 genetically different P. falciparum parasite clones. Effective multiplicity of infection (eMOI), which incorporates intra-host relatedness between clones, was lower at 1.8, while 1-Wright's inbreeding coefficient (1-F_ws) was 0.39, both indicative of inbreeding. Parasite density was associated with measured intra-host diversity, with higher diversity observed for women with higher-density infections. eMOI showed an overall declining trend from 2017 to 2018-2019, and was highest in Magude. 1-F_ws showed similar trends but did not reach statistical significance. Primigravid women had higher eMOI compared to multigravidae in the univariate analysis, but the effect disappeared when adjusting for parasitemia, time, and area (Supplementary Table 1). No statistically significant differences were observed between seasons or human immunodeficiency virus (HIV)-status groups. Among children, 62.4% carried polyclonal infections, the average eMOI was 2.3, and 1-F_ws was 0.55. Similar to ANC users, children with higher-density infections showed higher eMOI.
Temporal trends in intra-host genetic diversity
A significant interaction was observed between area and time in the multivariate analysis of intra-host diversity at ANC, indicating different temporal trends within the three areas. Parasite densities did not change over time (Supplementary Fig. 1). In Magude, eMOI declined by 50% per year (95% CI: -0.78; -0.25, p<0.0002, Fig. 2a-c, Supplementary Table 2), with a shift toward more infections having eMOI<2 (Supplementary Fig. 2), while 1-F_ws and the odds of infections being polyclonal showed declining trends (58% and 46% yearly decline, respectively). No temporal changes in intra-host diversity were observed in Manhiça, while in Ilha Josina, there was an increasing trend over time in polyclonal infections. Intra-host diversity among 47 children from Magude sampled cross-sectionally was compared with that of samples from ANC users in Magude (Fig. 2a-c Magude panel and d-f, Supplementary Table 3). In multivariate regressions combining both populations, all metrics of intra-host diversity showed declining trends over time. Both populations showed significant declines in eMOI (-36% and -50% per year for children and ANC attendees, respectively), and eMOI was not associated with population group (p=0.21). 1-F_ws and the odds of having a polyclonal infection also tended to decline in both population groups, and no effect of population was detected.
Comparing H_E between ANC populations in the three areas, parasites in Magude showed less diversity than the parasite population in Ilha Josina (Fig. 3e,f, Supplementary Table 4). To compare H_E between ANC users and children, the ANC population was randomly subsampled within strata of areas and years to match the community population (n=33 samples from each population). Mean H_E did not differ between populations when accounting for locus-to-locus variability (p=0.95, Fig. 3g,h, Supplementary Table 5).
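The stratified subsampling described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `subsample_to_match`, the record layout, and the toy sample identifiers are hypothetical, and the sketch simply caps the draw when a stratum holds fewer samples than requested.

```python
import random
from collections import Counter, defaultdict

def subsample_to_match(source, target, key, seed=0):
    """Randomly draw from `source` so that each stratum (e.g. area and year)
    contributes as many samples as it does in `target`."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_stratum = defaultdict(list)
    for sample in source:
        by_stratum[key(sample)].append(sample)
    wanted = Counter(key(sample) for sample in target)
    drawn = []
    for stratum, count in wanted.items():
        pool = by_stratum.get(stratum, [])
        # Assumption: never draw more than the stratum holds in `source`.
        drawn.extend(rng.sample(pool, min(count, len(pool))))
    return drawn

# Hypothetical toy records: (sample_id, area, year)
anc = [(f"anc{i}", "Magude", 2017) for i in range(10)] + \
      [(f"anc{i}", "Manhiça", 2018) for i in range(10, 16)]
children = [("c1", "Magude", 2017), ("c2", "Magude", 2017), ("c3", "Manhiça", 2018)]

matched = subsample_to_match(anc, children, key=lambda s: (s[1], s[2]))
```

Here `matched` holds two ANC samples from Magude 2017 and one from Manhiça 2018, mirroring the stratum sizes of the toy community sample.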

Pairwise inter-host genetic relatedness
Genetic relatedness between pairs of P. falciparum infections from ANC users was estimated, including polyclonal infections (n=83,521 pairs). ANC infections generally showed low relatedness, with a mean pairwise identity-by-descent (IBD) of 0.026 (95% CI: 0.022; 0.033). IBD was slightly but significantly higher between infections in Magude compared to within and between other areas (Supplementary Fig. 3a, Supplementary Table 6). Infections in children tended to be more related to each other than infections in ANC attendees were, and more related than pairs spanning the two populations. Restricting the comparison to samples from overlapping years (2017-2019) and temporal windows (April 15 to June 30), mean IBD between ANC infections was 0.018, similar to the mean IBD of 0.017 observed for infections from children (Supplementary Fig. 3b, Supplementary Table 7).

Markers of drug resistance
Prevalence of all markers of antimalarial resistance investigated in this study was similar between ANC users and children from the community (Table 3). Parasites with quintuple 51-59-108-437-540 mutations in the dihydrofolate reductase and dihydropteroate synthetase (pfdhfr-pfdhps) genes were highly prevalent in both populations (>90%). In particular, sulphadoxine-pyrimethamine (SP) resistance-associated polymorphisms in the pfdhfr gene had almost reached fixation in the population, with 98.6% carrying the triple 51-59-108 mutant. Neither A581G nor I431V mutations in pfdhps were detected. Three quarters of the study population carried a multidrug resistance 1 (pfmdr1) Y184F gene mutation associated with amodiaquine resistance, while 1.2% carried the N86Y mutation, and 0.3% carried the D1246Y mutation. The chloroquine resistance transporter (pfcrt) 72-76 CVIET mutant genotype was observed in four individuals, three of them children. No mutations in the kelch 13 propeller gene, pfkelch13, associated with artemisinin resistance, were observed in either population.

Discussion
This study applied a multiplexed amplicon sequencing approach targeting microhaplotypes and drug resistance markers to assess the representativeness of pregnant women attending their first ANC consultation for sentinel P. falciparum genomic surveillance. We found that genetic diversity and pairwise inter-host relatedness, as well as prevalence of drug resistance markers, were consistent between first ANC users and children aged 2-10 years, representing the community. In Magude, which was subject to an elimination campaign, similar declining trends in intra-host diversity were observed for both ANC users and children. Our findings demonstrate the potential of ANC-based malaria genomics as a straightforward and cost-efficient approach to assess the impact of antimalarial interventions and genetic variants of public health concern.
Pregnant women seeking ANC have previously been shown to mirror trends in malaria prevalence in the general population, although with a delay, and with more heterogeneity between gravidity groups in higher-transmission settings. 16,17 A few studies have also compared the genetic diversity of parasite populations in pregnant women and the community, 19,20,22 but these were based on small sample sizes, only one took place in Africa, and, importantly, none accounted for parasite densities. With this study, we expand the potential scope of ANC-based surveillance to include genomic surveillance of P. falciparum genetic diversity and resistance markers. We find that both primigravid and multigravid first ANC users, regardless of HIV status, can be included in a sentinel population. Since no differences were observed between seasons, sampling could take place throughout the year. However, other studies did find seasonal differences, 8 indicating that this might depend on the setting. Furthermore, it may not be realistic to reach sufficient sample sizes at ANC facilities alone at very low transmission, and it would be necessary to combine ANC sampling with other sampling strategies, such as health facility surveys. ANC sampling would also not be ideal if the goal is to identify finer relatedness patterns, including transmission networks, because of the temporal sparsity of samples. Consistent with previous observations that parasite populations are at least partially structured in time, 25 relatedness was higher among cross-sectionally sampled children than among continuously sampled ANC users, with the difference disappearing when restricting the comparison to similar temporal windows.
Genetic diversity has been proposed as a surrogate marker of transmission intensity. 9,10,12 In line with this and previous studies, 8,11 we found the highest population diversity in the highest-transmission setting, Ilha Josina. Conversely, we also found the lowest intra-host diversity in Ilha Josina (both eMOI and 1-F_ws). This might be explained by importation of parasites to low-transmission Magude and Manhiça from areas with higher transmission. A study from nearby low-transmission Eswatini observed similarly high diversity, which was attributed to frequent importation. 26 Genetic diversity on its own might, therefore, not always be a suitable proxy for local transmission intensity, and stratification based on genetic metrics should be carefully validated against other epidemiological data, including assessing the potential role of importation.
The genetic indicators of reduced transmission observed within Magude (decline in eMOI and 1-F_ws, increased mean IBD, and lower H_E) highlight how parasite genomics can complement clinical and epidemiological data to evaluate the impact of control interventions. Between 2015 and 2017, Magude was targeted with biannual rounds of mass drug administration (MDA), followed by reactive focal MDA in 2018, and three rounds of indoor residual spraying (IRS). 24 Even though parasite positivity rates declined in all three areas during the study period, and at similar levels and rates in Magude and the control area Manhiça (from 6% to 2% and from 8% to 3%, respectively), 27 we only observed evidence of declining intra-host diversity in Magude. Furthermore, Magude showed significantly lower population diversity and higher mean IBD compared to the other areas. A study from Zambia found a similar reduction in the complexity of infections following an MDA trial. 28 These findings reveal programmatically important changes to the parasite population structure, i.e., recent reductions in the parasite population size, not apparent from prevalence and incidence estimates.
Strengths of this study include the rich data obtained from deep amplicon sequencing, with sensitivity to achieve good coverage for samples with down to 10 parasites/µL. Compared to single nucleotide polymorphism (SNP)-based methods, microhaplotypes allow for higher resolution and consequently more accurate estimates of diversity and relatedness, while being more convenient than microsatellites. 9 Furthermore, whereas data from SNPs are often restricted to monoclonal samples, the use of highly diverse markers and newly developed analytical tools allowed us to make full use of information from polyclonal samples, 29 which made up half of the samples in this study. Another strength of this study is the large ANC sample size, collected prospectively across three years in three different transmission scenarios. To the best of our knowledge, this study represents the most comprehensive assessment of genetic diversity and relatedness of malaria infections among ANC users to date.
This study is limited by the number of samples available to sequence from children, particularly when stratifying by site and year, restricting comparisons with ANC attendees. For intra-host diversity, we therefore focused on Magude, where most samples from children originated. We did not consider the potential issue of parasite importation from neighboring regions, nor reasons for ANC non-attendance, although we would not expect any potential selection bias 15 to affect the parasite population. To confirm the generalizability of this approach for routine surveillance, more studies should be carried out in different epidemiological settings and include larger community sample sizes. Finally, we observed a clear dependence of sequencing coverage on parasite density, which may be explained by technical limitations. When only few, if any, parasite genomes are present in DNA extracted from a DBS, it will be difficult to amplify the parasite DNA for sequencing. This limitation applies to all genotyping techniques, 9 and we reached comparably high sensitivity with the protocol applied here. The relationship between density and intra-host diversity may also be affected by biological processes, such as competitive stress and host immunity, 30 and future studies are needed to investigate this. Regardless of underlying causes, parasite density is an important confounding factor to adjust for when studying intra-host diversity.
In conclusion, this study extends the scope of ANC-based sentinel surveillance to include genomic malaria surveillance. We did not observe differences in genetic diversity, relatedness or resistance markers between P. falciparum collected from ANC users and children representing the community. In both ANC users and the community, we found genetic indicators of a recent reduction in the parasite population in an area targeted for elimination, demonstrating the added value of genomic data for impact evaluation. Multiplexed amplicon sequencing has great potential to support decision-makers with genomic intelligence, and adopting a cost-effective and convenient ANC-based sampling strategy would be a valuable step towards making genomic surveillance more feasible in malaria-endemic areas.

Study design and setting
This genomic surveillance study took place between 2015 and 2019 in three malaria-endemic areas in Maputo province in southern Mozambique. Transmission intensity ranged from low in Manhiça and Magude, to moderate-to-high in Ilha Josina, and it declined in all three areas during the study. 17 Magude district was subject to a package of interventions in 2015-2018 including MDA with dihydroartemisinin-piperaquine and IRS with dichlorodiphenyltrichloroethane and pirimiphos-methyl, resulting in an 85% reduction in all-age positivity rates. 24 (S methods p 3)

Study participants
Samples were collected from pregnant women at ANC clinics and children participating in household surveys. A total of 10,439 pregnant women were recruited when attending their first ANC visit at Manhiça District Hospital, Ilha Josina Health Center, or Magude Health Center between November 2016 and November 2019. For 8,910 of the visits, informed consent to participate was obtained, and 8,745 visits were included in the study. 27 The main reason for exclusion was not residing in the area. Women donated a finger-prick drop of blood onto filter paper (dried blood spot), and HIV status, date, age, gravidity, area of residence, and recent movements were recorded. In addition, 3,933 children aged 2-10 years were sampled for annual age-stratified household surveys in the study area. The surveys were conducted around May every year (following the rainy season) from 2015 to 2019. DBS were obtained together with basic sociodemographic, clinical, and vector-control information. 24 Self-reported gender was evenly represented in the surveys, with 50.3% girls and 49.2% boys (information unavailable for the remaining 0.5%).

Ethics
All study protocols were approved by CISM's and Hospital Clínic of Barcelona's ethics committees, and the Mozambican Ministry of Health National Bioethics Committee. All study participants gave written informed consent, or in the case of minors, written informed assent and consent by a parent/guardian.

Amplicon sequencing
DNA was extracted from 558 available P. falciparum-positive DBS samples (from 378 ANC users and 180 children) using a Tween-Chelex based protocol (S methods p 3). A multiplex panel of PCR primers targeting 241 P. falciparum amplicons of 150-250 bp was developed (Paragon Genomics Inc, California, USA). Amplicons included 165 microhaplotypes informative about genomic diversity and relatedness in southern Africa, 29,31 and markers of drug resistance in 15 genes, 4 including polymorphisms associated with resistance to artemisinin (pfk13), SP (pfdhfr and pfdhps genes), and amodiaquine (pfcrt and pfmdr1 genes). We followed the manufacturer's instructions as in Tessema et al 2020. 29 DNA was amplified for 15 or 20 cycles for multiplexed PCR, depending on parasitemia and ability to amplify, and for 15 cycles for indexing PCR. A randomly selected subset of resulting libraries was assessed by capillary electrophoresis using a TapeStation (Agilent technologies, California, USA). Libraries were pooled accounting for differences in yield due to parasitemia, and the pool was bead-cleaned using CleanMag® Magnetic Beads at 1X ratio to remove primer dimers. Pooled libraries were run on an agarose gel from which the amplicon-sized band was excised, and DNA was extracted using the Monarch® DNA Gel Extraction Kit (New England Biolabs Inc., Massachusetts, USA). Library pools were quantified and assessed using a TapeStation and a Qubit fluorometer (S methods p 3). The purified libraries were sequenced on either a MiniSeq or a NextSeq instrument (Illumina, San Diego, USA).

Bioinformatics and data filtering
FASTQ files were run through a Nextflow-based pipeline 32 (version 0.1.5) to infer alleles. Briefly, reads were demultiplexed for each locus using cutadapt, 33 and DADA2 34 was used to cluster reads using an error-inference model. Cutadapt and DADA2 were also used to filter and truncate reads based on quality and length. Finally, homopolymers and tandem repeats were masked. Subsequently, alleles with fewer reads than the maximum observed reads in any locus for negative controls (14 reads) were removed, along with alleles with <1% within-sample frequency. Samples with a coverage of <50 diversity loci with a read depth of 100 were filtered out. Finally, diversity loci with <100 samples covering them with a read depth of 100 were also removed.
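The filtering thresholds above can be sketched in Python as follows. This is a hedged illustration, not the pipeline's implementation: the data layout (`{(sample, locus): {allele: reads}}`) and function names are hypothetical, "read depth of 100" is read as "at least 100 reads", within-sample frequency is taken per locus, and locus coverage is counted over retained samples only.

```python
from collections import defaultdict

def filter_alleles(counts, min_reads=14, min_freq=0.01):
    """Drop alleles with fewer than `min_reads` reads or with a
    within-sample frequency (at that locus) below `min_freq`."""
    kept = {}
    for key, alleles in counts.items():
        total = sum(alleles.values())
        if total == 0:
            continue  # nothing sequenced for this sample/locus
        passing = {a: r for a, r in alleles.items()
                   if r >= min_reads and r / total >= min_freq}
        if passing:
            kept[key] = passing
    return kept

def filter_samples_and_loci(counts, min_depth=100, min_loci=50, min_samples=100):
    """Keep samples covering at least `min_loci` loci at `min_depth`, then keep
    loci covered at `min_depth` by at least `min_samples` retained samples."""
    depth = {key: sum(alleles.values()) for key, alleles in counts.items()}
    covered = {key for key, d in depth.items() if d >= min_depth}

    loci_per_sample = defaultdict(set)
    for sample, locus in covered:
        loci_per_sample[sample].add(locus)
    good_samples = {s for s, loci in loci_per_sample.items() if len(loci) >= min_loci}

    samples_per_locus = defaultdict(set)
    for sample, locus in covered:
        if sample in good_samples:
            samples_per_locus[locus].add(sample)
    good_loci = {l for l, ss in samples_per_locus.items() if len(ss) >= min_samples}

    return {key: alleles for key, alleles in counts.items()
            if key[0] in good_samples and key[1] in good_loci}
```

The defaults mirror the thresholds stated in the text (14 reads, 1%, 50 loci, 100 samples, depth 100); they are exposed as parameters so the same logic can be tried on small toy inputs.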

Definitions
Rainy season was defined as November 1st to April 30th, and the remaining year as dry season. 27 Years were defined based on transmission season, i.e., from November 1st to October 31st. When comparing time periods for ANC attendees, 2018 and 2019 were combined to balance sample size with 2017, when more cases were sampled due to higher transmission. Only children were sampled in 2015 and 2016, and these years were also combined. Primigravidity was defined as a first pregnancy, while multigravidity was defined as having had at least one previous pregnancy. Population diversity was measured as H_E, i.e., the probability that two randomly selected parasites carry distinct alleles at each diversity locus (n=165). It was calculated as:
H_E = \frac{n}{n-1}\left(1 - \sum_{i} p_i^2\right)
where n is the population size, and p_i is the frequency of the i-th allele, with allele frequencies estimated statistically using a Markov Chain Monte Carlo (MCMC) algorithm from MOIRE v3.0.0 (R package). 35 Intra-host diversity was measured using the following metrics: MOI, eMOI, 1-F_ws, and proportion of polyclonal infections (eMOI>1.1).
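As a small worked illustration of the H_E definition, the following Python sketch assumes the standard sample-size-corrected form, H_E = n/(n-1) × (1 − Σ p_i²). It takes allele frequencies as given, whereas the study estimated them with MOIRE; the function names are hypothetical.

```python
def expected_heterozygosity(allele_freqs, n):
    """Sample-size-corrected expected heterozygosity at one locus.

    allele_freqs: frequencies p_i of the alleles at the locus (summing to 1)
    n: population (sample) size
    """
    if n < 2:
        raise ValueError("n must be at least 2 for the n/(n-1) correction")
    homozygosity = sum(p * p for p in allele_freqs)
    return n / (n - 1) * (1.0 - homozygosity)

def mean_heterozygosity(freqs_by_locus, n):
    """Average H_E across loci; `freqs_by_locus` maps locus -> list of p_i."""
    values = [expected_heterozygosity(f, n) for f in freqs_by_locus.values()]
    return sum(values) / len(values)
```

For example, a locus with two equally frequent alleles in a sample of n=10 gives (10/9) × 0.5 ≈ 0.56, while a monomorphic locus gives 0.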
Individual MOI and eMOI were also estimated with the MOIRE MCMC algorithm. eMOI takes within-host relatedness into account, and can be interpreted as the expected MOI if population diversity was infinite (H_E=1). F_ws was calculated as the allele heterozygosity of the individual relative to the population:
where n is the number of alleles detected at the i-th locus of a given sample. Individual mean F_ws was calculated across all diversity loci. Pairwise infection (inter-host) relatedness was estimated as IBD, i.e., the proportion of the genome shared between parasites through recent ancestry, using Dcifer v1.2.0 (R package), 36 accounting for the presence of polyclonal infections and the probability that regions of the genome are shared by chance. Prevalence of resistance markers was calculated as the number of individuals carrying a mutated allele out of all individuals with a valid genotype for the respective locus. If both wildtype and mutant alleles were present in one individual, the individual was considered a mutant carrier if the infection was polyclonal by eMOI (eMOI>1.1); otherwise only the major allele (wildtype or mutant) was considered. For genotypes involving multiple amplicons, only samples with a single allele present were included to avoid issues with phasing.
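The rule for calling mutant carriers in mixed infections can be written out as a short Python sketch. It is illustrative only: the function names and input layout are hypothetical, and the "major allele" is assumed to be the allele with the most reads.

```python
def carries_mutant(allele_reads, mutant_alleles, emoi, polyclonal_cutoff=1.1):
    """Classify one individual at one resistance locus.

    allele_reads: {allele: reads}; empty or None means no valid genotype
    mutant_alleles: set of alleles regarded as mutant
    emoi: effective multiplicity of infection for the individual
    Returns True (mutant carrier), False (wildtype) or None (no valid call).
    """
    if not allele_reads:
        return None
    has_mutant = any(a in mutant_alleles for a in allele_reads)
    has_wildtype = any(a not in mutant_alleles for a in allele_reads)
    if has_mutant and has_wildtype and emoi <= polyclonal_cutoff:
        # Mixed call in a monoclonal infection: keep only the major allele
        # (assumed here to be the allele with the most reads).
        major = max(allele_reads, key=allele_reads.get)
        return major in mutant_alleles
    return has_mutant

def prevalence(calls):
    """Mutant carriers divided by individuals with a valid genotype."""
    valid = [c for c in calls if c is not None]
    return sum(valid) / len(valid) if valid else None
```

Under these assumptions, a 90:10 wildtype-to-mutant read mix counts as a mutant carrier when eMOI is 2.0 but as wildtype when eMOI is 1.0.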

Statistical analysis
Univariate and multivariate regression analyses were used to estimate intra-host diversity and assess the effect of factors of interest. P-values and confidence intervals for eMOI were obtained from zero-truncated Poisson regressions. Logistic regression was used for percentage polyclonal and 1-F_ws. The effect size of continuous time on intra-host diversity was estimated from multivariate regressions with an interaction between time and area. To compare intra-host diversity between ANC users and children, only samples from Magude were included due to low sample sizes for children in Manhiça and Ilha Josina. H_E was compared between populations with Linear Mixed Models (R package nlme) fitting locus as a random effect. Random subsampling matching populations by area and year was performed to compare groups of similar sample size. Differences in mean relatedness were assessed with permutation testing. Prevalence of resistance markers was compared with Pearson's chi-square test or Fisher's exact test. Multiple comparisons were corrected for using the Benjamini-Hochberg procedure with a q-value of 0.05, resulting in a final alpha of 0.0062 applied to indicate significance. All analyses were performed using R version 4.3.0.
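To show how a Benjamini-Hochberg correction yields a single significance threshold, here is a minimal Python sketch of the step-up procedure. The study's analyses were run in R; the function name and any p-values passed to it are hypothetical and are not the study's data.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns (rejected, threshold): `rejected[i]` is True when hypothesis i is
    declared significant, and `threshold` is the largest critical value
    k/m * q met by the k-th smallest p-value (0.0 if nothing is rejected).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * q:
            k_max = rank  # keep the largest rank that passes
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    threshold = k_max / m * q if k_max else 0.0
    return rejected, threshold
```

Because the procedure is step-up, every p-value ranked at or below the largest passing rank is rejected, even if it missed its own critical value. The resulting threshold depends on the number of tests and on the observed p-values.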

Role of the funding source
The funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report.

Figures

Figure 1

Figure 3 Population
3. Gerlovina, I., Gerlovin, B., Rodríguez-Barraquer, I. & Greenhouse, B. Dcifer: an IBD-based method to calculate genetic distance between polyclonal infections. Genetics 222 (2022). https://doi.org/10.1093/genetics/iyac126

Tables

Table 1. Characteristics of study participants by population group and inclusion in analysis after filtering
Sequencing was attempted for all available samples. Those that did not pass the filtering criteria were excluded, while the remaining samples were included in the analysis. Population groups are 1) pregnant women attending their first antenatal care (ANC) visit, and 2) children aged 2-10 years sampled in population-representative household surveys. n=number

of individuals carrying a mutant allele, N=total individuals with a valid call. In individuals carrying multiple different genotypes at a given locus, both alleles are considered valid if eMOI>1.1, otherwise only the major allele is included. p-values from Pearson's chi-square test or Fisher's exact test of difference in proportion between ANC and children, depending on sample size.