Minor allele frequency distribution of Pfcsp
A total of 90 SNPs within Pfcsp were analysed for minor allele frequency (MAF). The P. falciparum population from Navrongo showed more variability in Pfcsp (55 SNPs) than the Cape Coast population (35 SNPs) (Fig. 2). The allele frequencies of all putative SNPs within the Pfcsp locus ranged from 0.001 to 0.45 in Navrongo and from 0.001 to 0.40 in Cape Coast (Fig. 2). As expected for natural P. falciparum populations in Africa (high transmission settings), the allele frequency spectrum was dominated by very-low-frequency alleles (MAF ≤ 0.05) in both populations. Rare alleles (MAF ≤ 0.01) accounted for 62.9% (22/35) and 61.8% (34/55) of SNPs in Cape Coast and Navrongo, respectively. Low-frequency variants (0.01 < MAF ≤ 0.05) accounted for 20% (7/35) and 10.9% (6/55) of SNPs in Cape Coast and Navrongo, respectively. The remaining alleles, however, showed a moderate to high MAF in both populations, implying some underlying evolutionary events.
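The binning used above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the bin edges are taken from the text (rare: MAF ≤ 0.01; low: 0.01 < MAF ≤ 0.05), and the count of moderate-to-high-frequency SNPs is derived by subtraction rather than reported directly.

```python
def maf_class(maf):
    """Assign a minor allele frequency to one of the three bins used in the text."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency must lie in [0, 0.5]")
    if maf <= 0.01:
        return "rare"
    if maf <= 0.05:
        return "low"
    return "moderate_to_high"


def class_percentages(counts):
    """Convert per-bin SNP counts into percentages rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Rare and low counts are those reported in the text; the third bin is the
# remainder of each population's SNP total (35 and 55), i.e. derived here.
cape_coast = {"rare": 22, "low": 7, "moderate_to_high": 35 - 22 - 7}
navrongo = {"rare": 34, "low": 6, "moderate_to_high": 55 - 34 - 6}

print(class_percentages(cape_coast))
print(class_percentages(navrongo))
```

Under these assumptions, the remainder corresponds to 6 of 35 SNPs in Cape Coast and 15 of 55 SNPs in Navrongo with MAF above 0.05.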
Within-host genetic diversity of Pfcsp
To assess the within-host diversity of Pfcsp in the population, the inbreeding coefficient (Fws) was investigated. Isolates with Fws ≥ 0.95 were considered single-strain (monoclonal) infections, while Fws < 0.95 indicated diverse multistrain (polyclonal) infections. In Cape Coast, 71.7% of Pfcsp isolates (66/92) came from single-strain infections with high inbreeding potential, while 28.3% (26/92) came from highly diverse multistrain infections with high potential for outcrossing (Fig. 3). Of the P. falciparum infections from Navrongo, 50.8% (65/128) were monoclonal for Pfcsp and 49.2% (63/128) harboured multiple Pfcsp strains (Fig. 3). The Navrongo Pfcsp isolates exhibited significantly higher within-host diversity than those from Cape Coast (χ² = 15.382, p = 0.00009).
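As a rough illustration of the classification rule, the sketch below assumes the commonly used definition Fws = 1 - Hw/Hs, where Hw is the within-host heterozygosity and Hs the heterozygosity of the local parasite population. The text does not state how Fws was estimated, so this definition, the function names and the example values are assumptions; only the 0.95 threshold comes from the text.

```python
def fws(h_within, h_population):
    """Inbreeding coefficient under the assumed definition Fws = 1 - Hw/Hs."""
    if h_population <= 0:
        raise ValueError("population heterozygosity must be positive")
    return 1.0 - h_within / h_population


def infection_type(fws_value, threshold=0.95):
    """Apply the threshold from the text: Fws >= 0.95 is a single-strain infection."""
    return "single-strain" if fws_value >= threshold else "multistrain"


def summarise(fws_values, threshold=0.95):
    """Count single-strain and multistrain isolates and report the single-strain share."""
    n = len(fws_values)
    single = sum(1 for value in fws_values if value >= threshold)
    return {
        "single-strain": single,
        "multistrain": n - single,
        "pct_single": round(100 * single / n, 1),
    }


# Hypothetical Fws values arranged to mirror the Cape Coast counts (66 and 26).
example = [1.0] * 66 + [0.5] * 26
print(summarise(example))
```

An isolate with no within-host heterozygosity (Hw = 0) gives Fws = 1 under this definition and is therefore classed as single-strain.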
Genetic diversity of Pfcsp C-terminal haplotypes
To assess the extent of genetic diversity and similarity within and between the two populations, the diversity in the C-terminal region of Pfcsp (231 bp) was investigated from a total of 440 DNA sequences from Cape Coast (n = 184) and Navrongo (n = 256) (Table 1) and summarised in a Templeton, Crandall, and Sing (TCS) network (Fig. 4).
Table 1: Diversity indices of the Pfcsp C-terminal region of samples included in the network analysis
Population | n | h | S | K | π | Hd
Cape Coast | 184 | 15 | 8 | 1.15 | 0.005 ± 0.0004 | 0.718 ± 0.026
Navrongo | 256 | 53 | 16 | 3.76 | 0.016 ± 0.0007 | 0.925 ± 0.009
Note: columns h to Hd are the calculated indices. n = number of sequences, h = number of unique haplotypes, S = number of segregating sites, K = average number of pairwise nucleotide differences, π = nucleotide diversity, Hd = haplotype diversity
In total, 66 haplotypes were observed among the 440 Pfcsp sequences obtained from both populations (Fig. 4). Of these, 15 and 53 were found in the Cape Coast and Navrongo populations, respectively. The RTS,S vaccine haplotype (Pf3D7-type) and 1 nonvaccine haplotype (denoted as “Hap 10”) were found in both populations (Fig. 4). The Pf3D7-type haplotype represented only 5.9% (15/256) of haplotypes in Navrongo but 45.7% (84/184) in Cape Coast (see Additional file 2). In Navrongo, only a single sample exhibited Hap 10 (0.4%), whereas this haplotype represented 6.0% of the total haplotypes in Cape Coast (11/184) (Additional file 2). While the Pf3D7-type haplotype was the most prevalent Pfcsp C-terminal haplotype (45.7%) in isolates from Cape Coast, the most frequent haplotype in the Navrongo isolates was “Hap 16”, representing 20.3% (52/256) of the haplotypes detected (Additional file 2).
According to the analysed genetic diversity indices, the Pfcsp C-termini of the Navrongo isolates were generally more diverse than those from Cape Coast (Table 1). Specifically, more pairwise nucleotide differences (K = 3.761) and segregating sites (S = 16) were observed in Navrongo than in Cape Coast (K = 1.148, S = 8). Consequently, Pfcsp nucleotide diversity (π) was higher in the Navrongo isolates (π = 0.016 ± 0.0007) than in the isolates from Cape Coast (π = 0.005 ± 0.0004). Haplotype diversity was also higher in the Navrongo (Hd = 0.925 ± 0.009) than in the Cape Coast (Hd = 0.718 ± 0.026) parasite isolates.
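The relationship between these indices can be illustrated with a short sketch. It assumes the standard definitions, which the text does not spell out: nucleotide diversity as the average number of pairwise differences per site (π = K / L, with L = 231 bp for the C-terminal region) and Nei's haplotype diversity, Hd = n/(n - 1) × (1 - Σ p_i²). Function names are hypothetical.

```python
def nucleotide_diversity(k, length_bp):
    """Per-site nucleotide diversity: average pairwise differences divided by length."""
    if length_bp <= 0:
        raise ValueError("sequence length must be positive")
    return k / length_bp


def haplotype_diversity(haplotype_counts):
    """Nei's haplotype diversity from the number of sequences carrying each haplotype."""
    n = sum(haplotype_counts)
    if n < 2:
        raise ValueError("at least two sequences are required")
    sum_sq = sum((count / n) ** 2 for count in haplotype_counts)
    return n / (n - 1) * (1.0 - sum_sq)


# K values from the text, region length from the text (231 bp).
print(round(nucleotide_diversity(1.148, 231), 3))  # Cape Coast
print(round(nucleotide_diversity(3.761, 231), 3))  # Navrongo
```

With the reported K values and a 231 bp region, this calculation rounds to the π values listed in Table 1, which suggests, but does not confirm, that π was computed per site over the full region.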
TH2R and TH3R amino acid haplotype diversity
The TH2R and TH3R sites were more polymorphic in both populations than the rest of the amino acid sequence in the C-terminal region of PfCSP. Non-synonymous mutations predominated in both the TH2R and TH3R epitope regions in all isolates, with implications for cross-protection. Among the 92 isolates from Cape Coast (184 amino acid haplotypes) and the 128 isolates from Navrongo (256 amino acid haplotypes), there were 8 and 27 nonvaccine TH2R haplotypes, respectively (see Additional file 3). There were also 2 and 10 nonvaccine TH3R haplotypes in Cape Coast and Navrongo, respectively, with 1 nonvaccine haplotype (NKPKDELNYAND) shared between the two populations (Additional file 3). The frequencies of the Pf3D7-type TH2R vaccine haplotype (PSDKHIKEYLNKIQNSL) were 56.5% in Cape Coast and 7.4% in Navrongo (Fig. 5A), while those of the Pf3D7-type TH3R vaccine haplotype (NKPKDELDYAND) were 79.3% in Cape Coast and 18.4% in Navrongo (Fig. 5B). The number of amino acid differences between the Pf3D7 reference (PF3D7_0304600.1, PlasmoDB) and the Ghanaian isolates ranged from 1 to 6 in both epitope regions (see Additional file 3).
Population differentiation and structure of Pfcsp
The overall Weir and Cockerham’s Fst between the Cape Coast and Navrongo Pfcsp populations was <0.05 (Fig. 6A), which indicates minimal population differentiation and suggests gene flow between the populations, despite the geographic distance between the sites. This is consistent with the lack of genetic structure observed between Cape Coast and Navrongo parasite isolates in the principal component analysis (Fig. 6B).
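To make the interpretation of a low Fst concrete, the sketch below computes a per-SNP Fst for a biallelic site. Note that it uses Hudson's estimator, a simpler alternative, and not the Weir and Cockerham estimator reported above; the allele frequencies and sample sizes are hypothetical and are not taken from the study.

```python
def hudson_fst(p1, p2, n1, n2):
    """Hudson's Fst estimator for one biallelic SNP.

    p1, p2: sample allele frequencies in the two populations.
    n1, n2: number of sampled sequences in each population.
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("each population needs at least two sequences")
    # Numerator: squared frequency difference, corrected for sampling variance.
    numerator = (
        (p1 - p2) ** 2
        - p1 * (1 - p1) / (n1 - 1)
        - p2 * (1 - p2) / (n2 - 1)
    )
    # Denominator: probability that two alleles drawn from different populations differ.
    denominator = p1 * (1 - p2) + p2 * (1 - p1)
    if denominator == 0:
        raise ValueError("Fst is undefined when the site is fixed for the same allele")
    return numerator / denominator


# Hypothetical frequencies: similar in both populations versus fixed differences.
print(hudson_fst(0.50, 0.50, 100, 100))
print(hudson_fst(1.00, 0.00, 100, 100))
```

Identical allele frequencies give a value at or slightly below zero (the sampling correction can make the estimate negative), whereas fixed differences give a value of 1; values below 0.05 therefore sit close to the no-differentiation end of the scale.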
Evidence of selection within populations
Tajima’s D values were greater than zero in the TH2R and TH3R epitope regions of the C-terminal region of Pfcsp (221,422-221,583) for the population of monoclonal Pfcsp isolates from Navrongo (Fig. 7A), suggesting balancing selection. However, Tajima’s D was < 0 in the Cape Coast population at these loci, suggesting directional selection or clonal expansion in the population. Alleles at SNP locus 221554, which lies within the segment encoding the TH2R epitope, had an |iHS| > 3 in the Navrongo population, suggesting recent positive selection (Fig. 7C). The extended haplotype homozygosity analysis revealed some extended haplotypes from the focal SNP locus 221554 in the Navrongo population, but no long-range haplotypes extended beyond 221554 (Figures 8A and 8B).
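For readers unfamiliar with the statistic, the sketch below implements the standard Tajima's D formula from the number of sequences (n), the number of segregating sites (S) and the average number of pairwise differences (K). As an illustration only, it is applied to the whole-region values in Table 1; the analysis reported above used monoclonal isolates and specific epitope coordinates, so the values it produces are not expected to match Fig. 7, and the function name is hypothetical.

```python
import math


def tajimas_d(n, s, k):
    """Tajima's D from sample size n, segregating sites s and mean pairwise differences k."""
    if n < 4 or s < 1:
        raise ValueError("need at least four sequences and one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation.
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Whole-region values from Table 1 (all sequences, not only monoclonal isolates).
print("Navrongo:", tajimas_d(256, 16, 3.761))
print("Cape Coast:", tajimas_d(184, 8, 1.148))
```

D is positive when K exceeds S/a1 (an excess of intermediate-frequency variants, as expected under balancing selection) and negative when K falls below it (an excess of rare variants, as expected after directional selection or population expansion).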