Genetic diversity of Vietnam pfmsp1 block II
A total of 377 pfmsp1 block II sequences were successfully obtained from Vietnam P. falciparum, with MAD20 alleles being overwhelmingly predominant (375/377, 99.47%). The K1 and RO33 alleles were each identified only once (Fig. 2). Block II of the MAD20 alleles comprised various combinations and arrangements of seven distinct peptide repeat motifs (PRMs), namely SGG, SVT, SVA, SKG, SSG, PVA, and TVA, yielding a total of 25 unique MAD20 alleles (A1–A25). None of the alleles shared the same PRM configuration with the reference sequence (X05624.2). The MAD20 alleles differed in composition, each containing 1 to 14 PRMs. Two alleles, A14 and A20, were the most prevalent, accounting for 180 and 139 sequences, respectively. Block II of the K1 allele also demonstrated genetic variation relative to the reference sequence (NC_004330.2) and contained only three PRMs: SAQ, SGT, and SGP. In contrast, block II of the RO33 allele was identical in sequence to the reference (M55001.1).
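The decomposition of a MAD20 block II repeat region into tripeptide PRMs can be illustrated with a minimal Python sketch. It assumes the repeat region has already been extracted as an amino acid string and is read in consecutive, non-overlapping triplets; the function name `split_prms` and the example sequence are hypothetical and are not taken from the study's data or pipeline.

```python
from collections import Counter

# PRMs reported in the text for Vietnam MAD20 block II.
MAD20_PRMS = {"SGG", "SVT", "SVA", "SKG", "SSG", "PVA", "TVA"}


def split_prms(repeat_region, known=MAD20_PRMS):
    """Split a repeat region into consecutive tripeptides.

    Returns the ordered list of motifs and the list of motifs
    that are not in the `known` set.
    """
    if len(repeat_region) % 3 != 0:
        raise ValueError("repeat region length is not a multiple of 3")
    motifs = [repeat_region[i:i + 3] for i in range(0, len(repeat_region), 3)]
    unknown = [m for m in motifs if m not in known]
    return motifs, unknown


# Hypothetical repeat region built from the motifs named above;
# it does not correspond to any allele (A1-A25) in this study.
example = "SGGSVTSGGSVASKG"
motifs, unknown = split_prms(example)
counts = Counter(motifs)  # number of occurrences of each PRM
```

Under these assumptions, the length of `motifs` gives the number of PRMs in an allele, and the ordered list itself describes the arrangement that distinguishes one allele from another.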
Genetic differences of pfmsp1 block II in the global population
The allelic diversity of Vietnam’s pfmsp1 block II was compared with that of other countries and of previously reported Vietnam pfmsp1 (Fig. 3). The MAD20 alleles were predominant across all Vietnam pfmsp1 populations, a trend similarly observed in pfmsp1 populations of GMS countries, including Thailand and Myanmar, as well as Pacific countries such as the Philippines, PNG, and SI. Conversely, K1 alleles were primarily found in pfmsp1 populations from India, Africa, and South American countries. Notably, the RO33 allele was predominant in the Vanuatu pfmsp1 population. The global pfmsp1 population displayed significant genetic heterogeneity in MAD20 and K1 alleles, attributed to various compositions and arrangements of PRMs. For MAD20 alleles, 10 different PRMs, namely SGG, SVT, SVA, SKG, SSG, PVA, TVA, SGT, SGA, and SVG, were identified globally (Fig. 4). These PRMs were unequally distributed, with SGG, SVT, and SVA present in all populations. Additionally, SKG and SSG appeared in Asia, the Pacific, and Africa, but were absent in South American populations. The remaining five PRMs, namely PVA, TVA, SGT, SGA, and SVG, were identified only in either the Vietnam or the Indian population, with PVA and TVA being novel PRMs first detected in the Vietnam MAD20 alleles analyzed in this study. A similar variation in PRM distribution was found in global K1 alleles (Fig. 4). A total of 17 distinct PRMs were detected in global K1 alleles. Among these, SGT and SGP were consistently observed across all populations, while SAG was present in K1 alleles of all countries analyzed except Brazil. SGA occurred in some countries across Asia, Africa, and South America, but not in the Pacific region. The remaining 13 PRMs each occurred only in specific countries such as India, Kenya, or Tanzania.
These PRMs contributed to various combinations and arrangements within global MAD20 and K1 alleles, resulting in significant size variations and genetic heterogeneity of pfmsp1 by country (Fig. 5). Worldwide, MAD20 alleles varied widely in size, containing 1 to 19 PRMs, with the most prevalent sizes ranging from 9 to 15 PRMs. In comparison, global K1 alleles exhibited even greater size diversity owing to differing PRM compositions, ranging from 4 to 25 PRMs. Each country displayed variability in the number of PRMs present, with pronounced size variations in both MAD20 and K1 alleles, notably in pfmsp1 from India and African countries.
Genetic diversity of Vietnam pfmsp2 block III
A total of 289 Vietnam pfmsp2 block III sequences were successfully obtained from the samples analyzed in this study. All of these sequences were of the 3D7 type. They were polymorphic, forming 7 distinct alleles (A1–A7) distinguished by sequence polymorphisms (Fig. 6). The A4 and A2 alleles were predominant, accounting for 52.2% (151/289) and 41.9% (121/289), respectively. In the E1 region, only four amino acid changes (T44E, N47K, P48T, and P49S) were observed, categorized into two groups of paired amino acid substitutions. Specifically, the T44E/P49S pair was present in 280 sequences (96.9%), while the N47K/P48T pair was found in 9 sequences (3.1%). Different numbers, types, and arrangements of PRMs were observed in the R1 region, contributing to the size polymorphisms of Vietnam pfmsp2 block III. GAGGSGSA, GGSGSA, and GAGASGSA served as the fundamental units of PRMs. In the R2 region, all sequences exhibited the poly-threonine (poly-T) signature characteristic of the 3D7 type. Alleles A1–A6 contained 8 threonines (T8), while A7 featured 14 threonines (T14). Compared to the 3D7 reference sequence, a notable characteristic in the E3 region of all Vietnam pfmsp2 was the insertion of 11 amino acids at position 156: PKGKGEVQKPN for A1–A6 alleles and PKGNGGVQEPN for the A7 allele. Additionally, an amino acid change of E154K was identified in the A7 allele.
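The frequencies quoted above follow directly from the reported counts. The short Python sketch below recomputes them as a consistency check; it uses only the counts stated in the text, and the helper name `percent` is a hypothetical illustration rather than part of the study's analysis.

```python
def percent(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


TOTAL = 289  # pfmsp2 block III sequences reported in the text

# Counts taken from the text: two predominant alleles and the
# two paired substitutions in the E1 region.
reported = {
    "A4": 151,
    "A2": 121,
    "T44E/P49S": 280,
    "N47K/P48T": 9,
}

freqs = {name: percent(n, TOTAL) for name, n in reported.items()}

# Sequences belonging to the five less common alleles (A1, A3, A5-A7).
minor_alleles = TOTAL - reported["A4"] - reported["A2"]
```

The two E1 substitution pairs together account for all 289 sequences, and the five minor alleles together account for the sequences not assigned to A4 or A2.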
Genetic differences of pfmsp2 block III in the global population
The distribution of alleles in the current Vietnam pfmsp2 was compared to pfmsp2 from Vietnam (1994 and 2017‒2019) and other countries including Myanmar, Thailand, India, PNG, and Gambia (Fig. 7). All populations contained both allelic types, 3D7 and FC27, in substantial proportions, except the current Vietnam and PNG populations. The previously reported Vietnam pfmsp2 populations (Vietnam 1994 and 2017‒2019) harbored both 3D7 and FC27, even though their collection sites overlapped with those of the present study. Significant genetic diversity was observed in the global pfmsp2 block III. The R1, R2, and E3 regions were the primary contributors to genetic diversity within and between pfmsp2 populations (Fig. 8). The R1 region exhibited exceptionally diverse polymorphisms, characterized by 16 different types of PRM with unequal arrangements and repetitions (Fig. 8A). Three Asian pfmsp2 populations, Myanmar (2013‒2015), Thailand, and India, possessed all PRM types. The PNG and Gambia populations included 14 and 11 different PRM types, respectively. Notably, two Vietnam pfmsp2 populations exhibited distinct patterns. While the Vietnam pfmsp2 analyzed in this study displayed only 3 PRM types, Vietnam (1994) pfmsp2 comprised 10 different PRM types. In the R2 region, only two types of poly-T (T8 and T14) were identified in the Vietnam pfmsp2 population, whereas a variety of poly-T types were observed in the global population (Fig. 8B). The prevalence of poly-T types varied globally, with T8 and T14 being the dominant types. Intriguingly, T8 was markedly prevalent in the Vietnam pfmsp2 population. The E3 region of pfmsp2 also demonstrated substantial genetic diversity in the global population. A range of insertion types with diverse sequences was evident in the global pfmsp2 population (Fig. 8C). The PKGKGEVQKPN insertion appeared in all populations, albeit at varying frequencies by country.
Additionally, pfmsp2 sequences lacking any insertion were present in all populations except PNG.