Evolution of Plasmodium vivax populations in border areas of the Greater Mekong Subregion during malaria elimination

Background: Countries within the Greater Mekong Subregion (GMS) of Southeast Asia have committed to eliminating malaria by 2030. Although malaria situation has greatly improved, Plasmodium vivax remains at international border regions. Therefore, to gain a better understanding of transmission dynamics, knowledge on the evolution of P. vivax populations after the scale-up of control interventions will guide more effective targeted control efforts. Methods: We investigated genetic diversity and population structures in 206 longitudinally collected P. vivax clinical samples in two international border areas at the China-Myanmar border (CMB, n=50 in 2004 and n=52 in 2016) and western Thailand border (n=50 in 2012 and n=54 in 2015). Parasites were genotyped using 10 microsatellite markers. Results: Despite intensified control efforts, genetic diversity in the four populations remained high ( H E = 0.66-0.86). The proportions of polyclonal infections showed substantial decreases to 23.7 and 30.7% in the CMB and western Thailand, respectively, with corresponding decreases in the multiplicity of infection. Consistent with the shrinking map of malaria transmission in the GMS over time, there were also increases in multilocus linkage disequilibrium, suggesting of more fragmented and increasingly inbred parasite populations. There were considerable genetic differentiation and subdivision with the four tested populations. Various degrees of clustering were evident between the older parasite samples collected in 2004 at the CMB with the 2016 CMB and 2012 Thailand populations, suggesting some of these parasites had shared ancestry. In contrast, the 2015 Thailand population was genetically distinctive, which may reflect a process of population replacement. The moderately large effective population sizes and proportions of polyclonal infections highlight the necessity of further coordinated and integrated control efforts on both sides of the borders in the pursuit of malaria elimination. Conclusions: With enhanced control efforts on malaria elimination, P. vivax population in the GMS has fragmented into a limited number of clustered foci, but the presence of large P. vivax reservoirs still sustains genetic diversity and transmission. These findings provide new insights into P. vivax transmission dynamics and population structure in this area.


Background
As the global incidence of malaria has been greatly reduced in recent years, Plasmodium vivax has become the main source of malaria infections in co-endemic areas [1]. This is also true for most areas of the Greater Mekong Subregion (GMS) in Southeast Asia [2], which is actively pursuing regional malaria elimination. This shift in species dominance in the face of intensified control efforts highlights the remarkable adaptive potential and relative resilience of P. vivax to control measures [3]. In recognizing the challenge for eliminating vivax malaria, the GMS countries planned to eliminate falciparum malaria by 2025 and all malaria by 2030 [4]. As the malaria elimination plan progresses in the GMS, malaria displayed increasing heterogeneity in distribution with transmission being concentrated along international borders [5]. These international borders are porous with intensive human migration, which poses a major threat of parasite introduction [6][7][8]. In addition, border regions have drastically different ecology, vector systems, human populations, and are subject to influences of wars and civil unrest; thus malaria transmission in border regions are often unstable [9].
While malaria surveillance has been strengthened in the GMS countries, it can be complemented by population genetic studies to define parasite diversity, population structure, and migration among geographically separated transmission hotspots [10].
Since a major mechanism of generating genetic diversity in the malaria parasites is meiotic recombination of parasite strains in mosquito vectors, genetic diversity is intrinsically linked to the transmission intensity [11]. Likewise, it is expected that reduction of parasite population in response to intensified interventions will lead to reduced parasite genetic diversity. However, P. vivax seems to defy this principle. In most of its geographic range, P. vivax showed high levels of genetic diversity as revealed by population genetic studies using various genotyping markers [12][13][14][15][16][17][18][19][20][21]. Paradoxically, even in low transmission settings, P. vivax still maintains high levels of diversity [22][23][24][25][26]. As a result, in areas of co-endemicity, P. vivax and P. falciparum exhibit markedly different genetic diversity and population structures, with P. vivax showing more stable transmission patterns [14,15,[27][28][29]. Newly introduced parasites in areas where malaria had been eliminated may initially display low-level genetic diversity or even clonality [30,31], but the parasites could rapidly reestablish diverse populations [32]. As a result, the shrinking P. vivax populations in areas with intensified control frequently showed relatively high genetic diversity, but substantial levels of multilocus linkage disequilibrium (LD) and population substructure [15,17,23,33].
The scale-up of malaria control efforts in countries of the GMS has led to substantial changes in malaria epidemiology with a noticeable rising proportion of vivax malaria [5,34]. In the China-Myanmar border (CMB) areas, P. vivax not only has become the predominant Plasmodium species in recent years [35], but also has caused several malaria outbreaks [9,36]. Even in the international border regions, malaria transmission is concentrated in separated hotspots, and among which gene flow is expected to be low. Final attacks to eliminate these hotspots, guided by accurate surveillance, will ensure the success of the regional elimination program. An improved understanding of the transmission dynamics and the adaptive responses of the parasites to the control measures will also be essential to guide and monitor the progress of the elimination campaign. Recent studies of P. vivax populations from international border regions of Thailand with microsatellite (MS) markers clearly detected parasite population division, consistent with the separation of these parasite populations by the malaria-free central region [13,37]. Here we employed longitudinally collected P. vivax clinical samples to determine the effect of intensified control efforts on genetic diversity and evolution of the parasite populations in two border regions of the GMS.

Ethics statement
Written informed consent was obtained from all P. vivax patients or their legal guardians for participants under the age of 18 years. This study was approved by the Institutional Review Boards of Department of Disease Control, Thailand Ministry of Health and the Pennsylvania State University.
Use of the de-identified blood samples was approved by the Ethics Review Committee of China Medical University.

Study sites and P. vivax clinical samples
In order to investigate the evolution of the P. vivax population in response to intensified malaria control efforts in the GMS, P. vivax isolates were longitudinally collected from two different areas of the China-Myanmar and Thailand-Myanmar borders (S1 Fig). The CMB area has been traditionally malaria hyperendemic with both P. falciparum and P. vivax transmission. Although recent malaria control efforts have sharply reduced the P. falciparum incidence [35], P. vivax incidence remained consistently high and even experienced outbreaks in recent years [9,36]. In the CMB, 50 samples were collected in 2004 to represent the parasite population prior to the implementation of malaria elimination measures, while 52 samples collected in 2016 represented the parasites after scaling-up of malaria control. In Thailand, Tak province has been among the most malaria-endemic areas. In the past decade, malaria incidence in Tak has experienced more than 10-fold reduction, from 13,706 in 2012 to 1364 in 2016. Fifty P. vivax clinical samples obtained in 2012 from Tak Province represented the parasites before the endorsement of malaria elimination plan, and were used to compare with samples collected from the same area in later years [13]. Finger-prick blood samples were collected from symptomatic patients attending local malaria clinics after obtaining informed consent. Malaria diagnosis was based on microscopy of Giemsa-stained thin and thick smears. For confirmed P. vivax cases, ~100 μl of finger-prick blood were spotted onto Whatman filter paper, dried and stored in individually zipped plastic bags.

Microsatellite genotyping
Parasite genomic DNA was isolated from the dried blood spots on filter paper according to the protocol of QIAmp DNA Mini kit (Qiagen, Hilden, Germany). The final purified DNA was eluted into 35 ml of elution buffer and used immediately or stored at -20 °C until further use.
A volume of 2 ml of purified DNA was used as the template for malaria parasite detection by a genusspecific and species-specific nested PCR targeting the Plasmodium 18S rRNA genes [38]. Ten MS markers (MS1, MS2, MS5, MS6, MS7, MS9, MS10, MS12, MS15 and MS20) previously used to differentiate P. vivax populations in Thailand [13] were used for genotyping these P. vivax samples. A multiplex primary PCR was done using all 10 primer pairs and followed by singleplex secondary PCR with a fluorescently-labelled (6-FAM, VIC or NED) forward primer as previously described [39]. PCR products were used for GeneScan fragment analysis on an ABI3730xl capillary electrophoresis platform (Applied Biosystems) using the size standard LIZ500. Genotype calling was facilitated with GeneMapper Version 4.0. The predominant allele and any additional alleles with peak height at least one-third of the height of the predominant allele per locus were scored [40]. Genotyping success was defined as the presence of at least one allele at a given locus in a given sample.

Data analysis
Isolates containing more than one peak for any marker were considered to be multiple clone infections. The multiplicity of infection (MOI) was defined as the maximum number of alleles observed at any of the loci investigated. The mean MOI was calculated from the individual samples for each study site. Alleles were binned using the TANDEM software [41]. Isolates with one allele at all markers, were considered monoclonal infections. An infection was defined as polyclonal if more than one allele was observed at one or more loci. For isolates with more than one allele at any of the loci, the alleles with the highest peak were used to construct the dominant haplotypes as previously described [14]. Input files for the various population genetics software programs were created using CONVERT version 1.31.

Population diversity and differentiation
The indices of genetic diversity within populations, such as the number of polymorphic loci, the number of haplotypes (Nh), the number of alleles (Na), the mean allelic richness, and the expected heterozygosity (H E ) were calculated using Arlequin version 3.11 [42] and GenAlEx version 6.5 [43]. In order to assess the genetic differentiation among populations, pairwise comparisons were measured by calculating F ST using GenAlEx version 6.5. To estimate the partitioning of genetic variance for different hypothesized population groupings, we performed analysis of molecular variance (AMOVA) using GenAlEx version 6.5. The correlation between geographic and genetic distances was calculated by Mantel rank test in GenAlEx version 6.5.

Linkage disequilibrium (LD)
Multilocus LD in each population was calculated using the program LIAN 3.7 [44] with 50,000 iterations for burn-in and then 100,000 Markov Chain Monte Carlo (MCMC) iterations. Samples with missing data were excluded for this analysis. To avoid detecting false inbreeding resulting from clonal propagation and physical linkage, this analysis was performed for the combined dataset of single and dominant haplotypes and for unique haplotypes only. Since MS2 and MS5 both localize to chromosome 6 and MS12 and MS15 to chromosome 5, MS2 and MS12 were excluded as they had higher levels of missing data.

Bottleneck analysis
A heterozygosity excess test at the population level was used to detect the recent population bottleneck using BOTTLENECK 1.2.02 [46] with the SMM [47]. The Wilcoxon signed rank test, a robust statistic for testing less than 20 polymorphic loci, was executed in the model in order to ascertain the probability of significant heterozygote excess. Since the method implemented in the bottleneck has low power [48] unless the decline is greater than 90%, the Garza-Williamson index was computed using Arlequin version 3.11. The Garza-Williamson index is the mean ratio of the number of alleles at a given locus to the range in allele size, i.e., M = (k/r), where k is the number of alleles and r is the allelic range (i.e., the difference in repeat units between the shortest and the longest alleles at a locus) [49]. This measure is based on the assumption that in a bottleneck event, the number of alleles decreases faster than the allelic range because the latter is only reduced if the shortest and/or longest allele is lost, whereas the loss of any allele reduces the former. For the Garza-Williamson index, M < 0.68 indicates a bottleneck, whereas M > 0.80 indicates no reduction of effective population size.

Population structure
Deviations from Hardy-Weinberg equilibrium could indicate the presence of population structure or inbreeding [50]. To investigate whether haplotypes cluster into distinct genetic populations (K) among the defined geographic areas, we conducted a Bayesian analysis of population structure using STRUCTURE version 2.3.2 [51]. The admixture model was used and the posterior probability of the grouping number (K = 1~10) was estimated by the MCMC method with 10 separate runs to evaluate the consistency of the results. Each run was estimated as 10,000 steps with 100,000-step burn-in.
The best-fit number of grouping was evaluated using ΔK in the STRUCTURE HARVESTER version 0.6.93 tool [52,53]. The CLUMPP version 1.1.2 [54] and DISTRUCT 1.1(Rosenberg NA, 2004) were used to display the partitioning clusters.
To visualize genetic relationships among the parasite isolates from the four populations, an individualbased principal coordinate analysis (PCoA) was conducted in GenAlEx version 6.5 using the genetic distances among MS genotypes. In addition, phylogenetic relationships amongst P. vivax isolates were analyzed using the Neighbor-Joining method [55] implemented in MEGA7 [56]. Minimum spanning tree of parasite genotypes constructed by the goeBURST algorithm using the Phyloviz software v1.1 (http://www.phyloviz.net/)

Within-host and Population Diversity
A total of 206 P. vivax isolates from two areas of the GMS were genotyped at 10 MS markers.
Complete genotyping data for all 10 loci were obtained for 166 (80.5%) isolates. Samples with fewer than five MS markers genotyped were removed from analysis. The predominant allele of each locus in each sample was used for population genetic analysis. The 10 MS markers all were highly polymorphic with each one having 2 -20 alleles when all parasite populations were considered (S1 Table). Allele frequencies varied among the four populations. In the CMB samples, MS2 was the most diverse in 2004 with 20 alleles, whereas MS7 was the most diverse in 2016 with 16 alleles, followed by MS9 with 15 alleles. In the 2012 samples in western Thailand, MS20 was the most diverse with 18 alleles, followed by MS2 with 16 alleles. The 2015 samples from three provinces of Thailand also had MS2 as the most diverse marker [13]. Therefore, MS2 appeared to be among the most diverse markers in malaria-endemic areas in the GMS. In comparison, the least diverse MS markers differed among the populations: MS6 in the 2004 CMB population with 6 alleles, MS20 in the 2016 CMB population with two alleles; MS6 and MS12 equally least diverse in 2012 western Thailand with 7 alleles; and MS1 in 2015 western Thailand with 6 alleles (S1 Table).
Overall, 72 (34.9%) parasite isolates contained polyclonal infections (more than one peak for at least one MS marker), and the proportions of multiclonal infections were significantly different among the four parasite populations (P < 0.05, Pearson Chi-square test) ( Table 1) Table 1). The genetic diversity was correspondingly reflected in allele richness in these parasite populations ( Table 1) Despite the overall reduction in MOI, the genetic diversity of parasite populations at both sites remained high (Table 1). Haplotype diversity was high; a total of 162 haplotypes were observed and no haplotypes were shared among the four populations. The 2015 Thailand parasite population harbored the greatest number of haplotypes ( Table 1) (Table 1). This high genetic diversity may reflect substantially large parasite populations sustaining continued transmission at these two border areas. With both the SMM and IAM models, the effective population size N e at the CMB was moderate and showed a ~2-fold decrease over the past decade, whereas N e in western Thailand remained high and even had a ~2-fold increase in recent years ( Table 2).

Multilocus LD and Population Bottlenecks
Though haplotype sharing among the four parasite populations was not detected, the 2004 CMB and 2015 Thailand populations had low degrees of haplotype sharing within the populations. In the 2004 CMB population, two parasite isolates carried the same haplotypes 13 and 22, respectively (S2 Table). In the 2015 Thailand parasite population, identical haplotypes 25, 79, 92, and 109 were shared by two, two, three, and four parasites, respectively [13]. The standardized index of association I S A used to measure the degree of inbreeding revealed significant multilocus LD in the P. vivax populations from the both CMB (P < 0.0001) and western Thailand (P < 0.0001) ( Table 3) (Fig 1A), which corroborated the weak genetic differentiation between these two populations from F ST analysis (F ST = 0.064, Table 5). At K = 5, parasites in 2004 from the CMB and 2012 from Thailand were substantially admixed (Fig 1A), suggesting similar ancestry for parasites from the CMB and 2012 Thailand.  (Fig 1B). Such a finding was further illustrated with phylogenetic and network analyses.
The phylogenetic tree also supports the finding of substantial regional population structure and local clustering of haplotypes (Fig 2A). Clades 1 and 2 contained parasites predominantly from the CMB 2004 and 2016, which more or less reflected the lower right coordinate of the PCA analysis to show extensive clustering of parasites collected most recently from Myanmar with earlier parasite samples from the rest of the GMS (Fig 1B). Clade 3 and 4 parasites were predominantly from Thailand collected in 2012 and 2015, respectively. Similarly, network analysis roughly divided parasites into four clusters, with each cluster being dominated by parasites from a single population (Fig 2B). phenomenon that has been reported in numerous P. vivax-endemic areas [22][23][24]28]. Reason for this is not completely clear but is considered to be multifactorial [61]. Of direct relevance to the border settings is the mobility of the camp and border populations along the borders, which would introduce new parasites and enhance genetic diversity of the parasite populations. Our earlier study provided evidence on the existence of extensive gene flow among border communities and across international borders [16]. Unlike international borders are present as a potential barrier for the spread of P.

Cluster 4 contained mostly parasites from western
falciparum strains in the GMS [62], landscape factors at the CMB do not appear to impede gene flow [16]. Cross-border frequent human migration facilitates the importation of malaria into lowtransmission areas and this represents a major challenge to malaria elimination in the GMS [63].
Unlike the relatively stable genetic diversity over time, both border regions showed substantial reduction (17-18%) in the proportion of polyclonal infections and MOI. However, at the CMB, the proportion of polyclonal infections was unchanged compared to that determined in clinical P. vivax samples collected earlier. Nevertheless, compared to the hypoendemic settings in central China and Malaysia and hyperendemic settings such as Papua New Guinea, Cambodia, Indonesia and Ethiopia [15,22,[64][65][66][67], the proportions of polyclonal infections in parasite samples from recent years remained moderately high (23% and 30%) in these two border regions of the GMS. In addition, parasite populations in these regions remained relatively large, especially at the western Thailand border. Multiclonal infection, together with the large population size, will increase the chance of recombination, resulting in the observed high genetic diversity. It is therefore important to further enhance local malaria control efforts to achieve substantial reduction of the parasite populations in these transmission hotspots.
Another noticeable change is the detection of significant multilocus LD in the parasite populations despite high genetic diversity. This again seems to be a common finding in many vivx low-endemicity settings such as Indonesia, Papua New Guinea, South Korea, India, Vietnam, Colombia, and Brazil [15,59,61] [32, 68,69]. Significant LD against a background of high diversity may reflect the existence of multiple spatially clustered infections within a defined population, which might have arisen from rapid reduction in transmission and effective population size as malaria control interventions have intensified in the GMS. The overall trend of increasing LD in both sites and elsewhere in Thailand [13] is a demonstration of reduced overall parasite genetic complexity during malaria elimination.
Across different regions of Thailand, we have detected substantial parasite population differentiation [13]. This study also detected considerable divergence of the parasite populations across both time and space. There were no haplotype sharing among the four populations, and most of the parasites from individual populations fell into distinct clades or clusters from phylogenetic and network analyses, demonstrating that the parasite populations in the GMS have become fragmented with increasingly inbred. International border regions of the GMS showed high malaria transmission areas that were largely connected a decade ago [70]. Consistently, the older parasite population from the CMB 2004 showed remarkable degrees of clustering with the CMB 2016 and Thailand 2012 samples, which may reflect shared ancestry of these parasites in the recent past. In contrast, the current malaria map illustrated the presence of separated pockets of transmission foci from intensified control efforts of the malaria elimination campaign, which limit gene flow within GMS. The genetic differentiation between the contemporary parasite populations is an indication of independent evolution of these isolated parasite populations as also documented for the P. falciparum parasites [71]. In addition, heterogeneity within vector populations could affect transmission [72] and population structure of the parasites [73]. The notable shift in vector populations observed in western Thailand may also be responsible for shaping the parasite population structure [74]. Moreover, the parasite populations studied here have undergone apparent population bottlenecks, suggesting of clonal expansion of parasite isolates, or limited vectors or human movement within the defined geographic areas. Relaxation of control efforts could lead to clonal expansion of the residual or introduced parasites, as has been identified in southern Thailand [31,75]. At the CMB, the epidemic malaria transmission documented in the IDP camps and surrounding villages in 2013 and the detected reduction of effective population size may result from expansion of certain parasite isolates such as those that are resistant to the frontline treatment [76,77]. Interestingly, the Thailand 2015 samples were quite genetically distinct, and even differed considerably from the 2012 parasite from the same region, which may suggest population replacement. This is plausible as the border parasite populations are subject to extensive parasite introduction [6], and introduced parasites are capable of re-establishing a more diverse population within a short period [32].

Conclusions
Overall, this investigation demonstrated that the scale-up of malaria control interventions for malaria elimination in the GMS has resulted in fragmented P. vivax populations with increased inbreeding.
However, the persistently high genetic diversity, moderate levels of polyclonal infections, and considerably large effective population sizes demand concerted efforts of border nations to implement more effective measures targeting this resilient parasite. Especially in western Thailand, both parasite genetic diversity and effective population size have experienced slight increases in recent years. The large degree of population movement, rapid ecological changes and complex vector population dynamics in western Thailand are very conducive to sustained malaria transmission [78].
With a large proportion of infection linked to cross-border travel, it will be important to strengthen malaria control measures on both sides of the border. At these low-endemicity setting, the identification of the source of residual infections and implementation of targeted control may be critical for the final phase of elimination [79,80]. Our findings provide novel insights for malaria surveillance and enhanced control efforts especially targeting the fragmented populations to reach the goal of malaria elimination by 2030.

Availability of data and materials
The datasets supporting the conclusions of this article are available in Additional files.

Ethics approval
Ethical approvals were obtained from the institutional review boards of China Medical University, Thai Ministry of Health and Pennsylvania State University.
# MS2 and MS12 were excluded from analysis.

Supplementary Files
This is a list of supplementary files associated with this preprint. Click to download. S1