Shifts in P. falciparum genetic structure and gametocyte markers in the transition to elimination

Large-scale programs targeting Plasmodium falciparum (Pf) elimination can exert strong selection pressures on the parasite population. To better understand the impact that elimination initiatives can have on Pf genetic structure and gametocyte carriage, we applied amplicon-based sequencing of two polymorphic Pf genes and quantitative reverse-transcription PCR targeting gametocyte-specic genes to Pf isolates collected in Magude District (Southern Mozambique) before and after an elimination initiative. The 71% reduction of Pf prevalence achieved in 2 years was followed by reductions in Pf genetic diversity and increases in between-infection similarity. These genetic shifts were accompanied by increases in the relative transcript number of the female mature gametocyte marker pfs25, the pfap2g transcription factor that drives gametocytogenesis and the sexual ring marker pfgexp02, suggesting the parasite ability of adapting its sexual investment during elimination initiatives. Reactive interventions that target Pf sexual stages may be required to achieve complete interruption of transmission.


Introduction
Large-scale programs such as mass drug administration (MDA) which aim to quickly reduce the malaria parasite biomass in a community and to prevent new infections for a certain period can exert intensive selective pressures on these parasites. 1 While the impact of MDA on Plasmodium falciparum (Pf) genetic markers of antimalarial resistance has been assessed in few studies, 2 there is limited evidence about the changes in parasite population structure and transmission dynamics during the critical transition to elimination. Understanding these molecular processes can guide the design new strategies to enhance the long-term success of future programs.
There is great genetic diversity within Pf populations, with people living in endemic areas being frequently and simultaneously infected by several parasite strains. 3 Bottlenecks in parasite population driven by elimination efforts are expected to reduce the genetic diversity of Pf infections and increase their similarity due to inbreeding and recent common ancestry. 4 As genetic diversity is vital for malaria parasite' ability to evade the host immunity and adapt to changing environmental conditions, 5 these shifts in the Pf population structure can affect the dynamics of infections. Theoretical 6,7 and mouse models 8,9 suggest that competitive interactions between genetically distinct malaria parasites within an infection impose greater demands on replication at the expenses of reproduction. Moreover, expression of Pf genes that regulate the parasite's replicative and reproductive programmes, such as AP2-G (the transcription factor that initiates reproduction), 10 has been shown to vary depending on the intensity of malaria transmission. 11 Therefore, changes in the in-host ecology of Pf infections after drastic changes in transmission may impact the production of gametocytes and determine the transmission success of parasites escaping elimination efforts.
Here we employed parasites collected before and after the deployment of an intensive combination of interventions in Magude district (southern Mozambique) 12 to interrogate for shifts in population structure and gametocyte carriage. We hypothesized that reductions in genetic variation, as a result of drastic reductions in transmission intensity, would lead to increases in gametocyte production due to lower within-host competition. For this genetic investigation, we utilized amplicon next-generation sequencing to evaluate genetic diversity and similarity based on the presence and relative abundance of haplotypes of two highly polymorphic Pf genes. Subsequently, we used a quantitative reverse transcriptase PCR (RT-qPCR) assay to detect transcripts of female (pfs25) and male (pfs230) mature gametocytes, 13,14 as well as of pfap2g 10 and one of its earliest responders, pfgexp02, 15 which marks recently formed sexual forms. 16

Study design and participants
The study analyzed Pf isolates collected in the rural district of Magude, located in the north-west of Maputo province, southern Mozambique, 12 before (2015) and after (2017) the implementation of the rst phase of the Magude project. This was a malaria elimination feasibility project that consisted of two MDA rounds per year (at the beginning of the rainy season [November-December] and 4-6 weeks later) over two transmission seasons (2015-16 and 2016-17), in combination with an improved surveillance system, annual blanket indoor residual spraying, programmatically distributed long-lasting insecticide treated nets and standard case management. As a result, all-age Pf prevalence by rapid diagnostic test decreased from 9.1% in May 2015 to 2.6% in May 2017. 12 50 µL dried blood spots (DBS) were collected onto Whatman 3 mm lter paper from a random selection of individuals included in an age-strati ed community-based cross-sectional survey conducted at the end of the transmission season at baseline (May 2015; n = 1035) and three months after the last MDA round (May 2017; n = 3752), 12 as well as among consenting participants prior to rst MDA round (November 2015, n = 1322). DBS collected in May 2015 and 2017 were also preserved in 250 µl RNAprotect (Qiagen, Hilden, Germany) and stored in -80ºC. A minimum sample size of 63 and 45 per group was targeted to detect a 2-fold reduction in the proportion of monogenomic infections (assumed to be 0.5 at baseline) and mean pfs25 relative transcript levels (1.33, standard deviation [SD] 1.12 17 at baseline), respectively, with a 80% power and a two-sided signi cance test of 0.05.

Procedures
DNA extracted from DBS was used for Pf detection in duplicate by real-time quantitative PCR (qPCR) targeting 18S rRNA (Appendix methods). After con rming enough parasite DNA material using a nested PCR targeting pfcrt, 2 Pf positive samples were ampli ed using gene-speci c and indexing primers (Appendix methods and table s1) targeting 2 polymorphic regions of Pf genes coding for the circumsporozoite protein (pfcsp), and apical membrane antigen 1 (pfama1). 18 Pooled libraries were sequenced in an Illumina MiSeq instrument using paired end 2×300bp reads and a MiSeq v3 (600 cycles) ow cell. The sequencing quality and performance were assessed using gDNA extracted from Pf strains (3D7, 7G8, Dd2 and HB3) mixed at different concentrations (Appendix table s2) as well as 40 Pf isolates with the complexity of infection (COI) determined on the basis of the length of pfmsp1 and pfmsp2 amplicon sizes (Appendix methods). After sample demultiplexing and removal of primer sequences, HaplotypR pipeline version 0.3 was used to de ne pfama1 and pfcsp haplotypes. Samples with less than 25 reads for any of the markers were removed. Reads from both amplicons were trimmed to remove bad quality bases and merged reads containing forward and reverse sequences were generated. A SNP was de ned as a nucleotide position with a > 50% mismatch rate in the sequence reads using Pf strain 3D7 as reference from at least two independent samples. A haplotype was de ned as a combination of alleles in an entire amplicon. A minimum of 3 reads and of 0.1% within-host haplotype frequency was required to call a haplotype. To discard false positive haplotypes, we removed chimeric or singleton reads as well as haplotypes only present at a frequency lower than 1%.
RNA extracted from Pf-positive lter papers stored in RNAprotect was subjected to reverse transcription using PrimeScript™ RT Master Mix reagents (Takara). cDNA levels of pfs25, pfs230, pfap2g and pfgexp02 genes, as well as the PF08_0085 (ubiquitin-conjugating enzyme) housekeeping (HK) gene, were assessed using a RT-qPCR 13,14,16,17 (Appendix methods). Samples with a Ct>40 were considered as not ampli ed. The transcript copy number of each gene was quanti ed by extrapolation against a standard curve of ve serial dilutions of gDNA obtained from the 3D7 strain containing known numbers of ringinfected erythrocytes. Zeros were eliminated from the copy number data by adding to all values half of the limit of detection (LOD) of the qPCR, as calculated by the lowest concentration of target gene that can be detected in triplicate serial dilutions of 3D7 gDNA (Appendix table s3). The relative copy number (RCN) was calculated as the ratio of transcripts detected from target and HK genes.

De nitions and data analysis
Age was categorized as younger or equal to 15 years or older, and parasite densities quanti ed by qPCR as lower than 200 parasites/µL or higher. Private haplotypes were designed to those only present in preor post-MDA populations, and rare haplotypes to those occurring just once in the complete set of samples analyzed. Haplotypes detected were used to compute the COI (i.e. number of haplotypes present in an infection) and the presence of monogenomic infections (i.e. if a sample has one haplotype per locus).
The relative abundance of a given haplotype, de ned as the number of reads over all remaining reads within a sample, was used to calculate the Shannon-index 19 as where R is the number of haplotypes in a sample and p i is the frequency of haplotype I.
Genetic similarity of Pf populations was quanti ed through several metrics which take into account the presence of haplotypes and also their relative abundance. Binary sharing indicates whether a common haplotype exists between two samples; Jaccard distance measures the fraction of haplotypes that two samples have in common with respect to all the haplotypes common in any of the samples; and Pearson correlation coe cient measures the linear correlation of the haplotype frequencies between two samples as where x and y are the haplotype frequencies of two samples, i represents the ith haplotype, and σ X ,σ Y are the standard deviations of the two samples (Appendix methods).
Categorical and continuous data were compared between pre-and post-intervention using Fisher's exact test and Student t test, respectively. qPCR parasite densities and RCNs were log transformed. Logistic, negative binomial and linear regressions were used for the analysis of binary, counts and continuous data, respectively. Univariate and multivariate regression models adjusted by age and parasite density were conducted to assess the association between genetic and transcriptional variables with study period. A Jack-Knife resampling approach was used to calculate the variance of each similarity metric (binary sharing, Jaccard distance and Pearson CC) as well as of the fraction of sample pairs above different similarity thresholds. From this, error propagation was used to obtain the error of the differences between groups. Statistical tests were performed using Stata software version 15 (Stata Corp, College Station, Tx) and Python 3 programming language. The statistical signi cance was de ned as a p value<0.05.

Ethics
Study ethical approval was obtained from the National Mozambican Ethical Review Committee (Mozambique) and Hospital Clínic (Barcelona, Spain) ethics review committees, and signed written informed consent was obtained from all participants or from guardians/parents in the case of minors. Seventeen samples were excluded due to having less than 25 reads either in pfama or pfcsp, and 2 were further discarded due to incomplete demographic data (Fig. 1). Ninety (89.1%) and 80 (57.6%) of the Pfpositive samples from May 2015 and May 2017, respectively, were available in RNA protect and therefore used for the analysis of gametocyte-speci c markers. Among these, 51 (56.7%) and 57 (71.2%) ampli ed the HK gene (Fig. 1), and one pre-intervention sample was excluded due to incomplete demographic data.

Study population and samples analyzed
Study participants before and after the intervention from whom genetic and transcriptional data was available were similar in area of residence, age, sex and qPCR-quanti ed Pf densities (Table 1). Also, the selected participants were representative of the overall population of Pf-infected individuals detected by qPCR, except for parasite densities which were higher among those with successfully analyzed Pf infections (Appendix table s4). Appendix gure s1). After haplotype assignment and quality ltering of reads, we identi ed 30 SNPs for pfama1 and 24 for pfcsp. In total, 45 distinct haplotypes were detected at the pfama1 locus and 57 at the pfcsp locus ( Fig. 2A and B Fig. 2C). Pre-intervention private haplotypes were detected in less samples than shared haplotypes (Fig. 2D).

Between-infection Pf similarity indices
We employed 3 metrics that describe genetic similarity between pairs of Pf infections: binary sharing, Jaccard distance and Pearson CC. There was no evidence of statistically signi cant differences in the proportion of sample pairs that shared at least one haplotype (binary sharing; Appendix gure s3 and Appendix table s9). Pearson CC and Jaccard distance (Fig. 4A, B, C and D) of pfama1 increased from 2015 to 2017. Similarly, the fraction of sample pairs with high Pearson CC (> 0.9999) increased after the intervention (Fig. 4E) for both pfama1 (from 1.03-4.97%, p < 0.001) and pfcsp (from 1.03-3.4%, p = 0.019). Similar trends but not statistically signi cant were observed for highly similar parasites de ned on the basis of Jaccard distance (Fig. 4F). No correlation was observed between genetic relatedness and gender, area, use of antimalarial, age and parasite density (Appendix gure s4).

Discussion
The modulation of Pf reproductive investment in response to changes in transmission is expected from theory 6,7 and from the effect of environmental stressors on gametocyte conversion rates in experimental models 20 . Here we show that a 71% drop in Pf prevalence achieved in 2 years through the scale up of intervention in southern Mozambique is associated with reductions in Pf genetic diversity and increases in parasite similarity, overall indicating a severe bottleneck in the parasite population. To the best of our knowledge, this investigation presents for the rst time evidences of an increased expression of gametocyte-speci c genes accompanying this shift in parasite genetic structure. These changes in the Pf transmission potential during the transition to elimination may affect the success of subsequent interventions aiming to achieve complete interruption of transmission.
Genetic diversity of the Pf population in Magude district decreased after deploying an intensive package of interventions, as indicated by the consistent decline in pfama1 and pfcsp Shannon diversity index, as well as the reductions in pfama1 COI and frequency of polygenomic infections. This decline in genetic diversity was not due to differences in sequencing depth, as parasite densities and sequencing coverage were similar in both study periods. The decrease in the overall number of detected haplotypes and the loss of the rarest haplotypes rather suggests that the severe bottleneck in the parasite population probably drove this reduction in genetic diversity. Drops in diversity were accompanied by enhanced between-infection genetic similarity in parasites surviving the intervention, as evidenced by the proportion of highly-similar parasite pairs (Pearson CC > 0.95) which increased from 2-8% for pfama1 and from 3-6% for pfcsp. Our results suggest that genetic parameters may be useful to monitor the success interventions deployed at the community level, 21 especially when the parameters take into account the relative abundance of haplotypes (i.e., Shannon index and Pearson CC). Importantly, less than 100 samples per group were enough to detect signals of a bottleneck in the parasite population driven by a 71% drop in Pf prevalence. 12 Moreover, the enhanced genetic similarity after the bottleneck in the parasite population may increase the performance of genetic tools to identify imported infections with unrelated haplotypes. 22 The reduction in Pf within-host genetic diversity and the enhanced between-infection similarity of infections after the drastic drop in transmission were coincident with an increase in the detectability and levels of Pf transcripts speci cally expressed by female gametocytes (pfs25), sexually committed (pfap2g) and recently formed gametocytes (pfgexp02). Similar increases in gametocyte carriage have been described after reductions in transmission achieved through the use of a combination of interventions, including insecticide spraying and MDA, in Nigeria (Garki project) in the 1970s 23

and in
Netherlands New Guinea in the 1950s. 24 Reductions in the within-host diversity might have contributed to this plastic response if less competitive interactions eventually favors gametocyte investment at the expenses of asexual replication. 6,7,25 This could partly explain the low parasite densities in historical endemic areas that have experienced transmission reductions, but where immunity is still present, especially among adults. 26 Such counter-adaptation would allow the parasite to recover from severe bottlenecks by ensuring the continuation of some level of anti-disease immunity (therefore less detection and less treatment), while at the same time betting on its transmissibility to increase the population size. Alternatively, higher gametocyte conversion rates may have been selected by reductions in the Anopheline population driven by vector campaigns deployed during the elimination initiative, such as intra-door residual spraying. Further studies are required to assess if changes in sexual differentiation after sharp declines of malaria transmission are mainly driven by plastic responses or adaptive selection of Pf genes such as pfgdv1 [27][28][29] . Independently of the mechanisms used by the parasite, the higher investment in gametocyte production observed in this study suggest that reactive focal MDA approaches with gametocytocidic antimalarials such as primaquine, coupled with glucose-6-phosphate dehydrogenase de ciency testing, 30 may increase the impact of strategies aiming to interrupt or prevent the reestablishment of malaria transmission. 31 The results of this study are subjected to several limitations. First, Pf infections that were successfully analyzed correspond to those in the upper range of parasite densities, given the limit of detection of the transcriptional and sequencing approaches. 13,18 Second, the similarity metrics used were based on the coincidence of individual haplotypes between infections rather than on identity by descent. Third, we cannot discard that other factors may have contributed to this reduction in within-host genetic diversity, such as expansion of a small number of alleles with increased success in the new transmission condition, a shorter time to acquire multiple infections after the elimination initiative or selection of pfama1 and pfcsp speci c variants due to changes in host's immunity. 32 However, the contribution of these other factors were mitigated in our study owing to the restricted temporal scale of sampling and the abundance of unique haplotypes at each locus. Fourth, a few parasite isolates remained for genetic analysis after the marked reduction in the incidence of Pf malaria in Magude, although numbers were enough to detect changes in diversity and transcriptional markers. Finally, the study focused on the short-term impact of the intervention and further studies are needed to determine the duration of these changes.
In summary, this is the rst study to show that when malaria control programmes lead to a drastic reduction in transmission intensity, diversity declines while similarity and gametocyte production increases, supporting the concept that Pf can facultatively trade off investment in sexual transmissible stages to maximize their within-host competitive ability. 8,9 If associated with increased infectivity, this higher carriage of sexual stages may increase the ability of infections that survive elimination efforts to expand and recover to baseline levels, unless interventions that target the Pf sexual stages are deployed to sustain gains and eventually achieve interruption of transmission. The identi cation of biological mechanisms involved in these increases in gametocyte production can allow the design of new approaches for early programmatic reactions in settings approaching elimination.
Declarations support from the Spanish Ministry of Science and Innovation through the "Centro de Excelencia Severo Ochoa 2019-2023" Program (CEX2018-000806-S), and support from the Generalitat de Catalunya through the CERCA Program This research is part of ISGlobal's Program on the Molecular Mechanisms of Malaria, which is partially supported by the Fundación Ramón Areces. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Authors' contributions HG carried out the molecular determinations, and together with CR, JRG and AP analyzed the data. PC, AC, GM contributed to the analysis of samples. FS, PA, BG, PLA, LN, NRR, WS participated in the design of the interventions, eldwork and collected clinical and epidemiological data together with the biological samples. AM, HG and BG participated in the study design. AM conceptualized the study, interpreted the results and wrote the draft of this manuscript. All authors read and approved the nal manuscript.

Data availability
Anonymized data used for this analysis is available upon request.