In the present study, a total of 150 blood samples were collected from P. vivax malaria-infected patients at hospitals in the twin cities of Islamabad and Rawalpindi. The blood samples were verified by microscopic examination to confirm the presence of P. vivax parasites and to exclude samples with any other infections. PCR for the PvCSP and PvMSP-1 genes was carried out on the 150 P. vivax isolates, of which 35 PvCSP and 30 PvMSP-1 amplicons were sequenced. PCR products ranged from 900 to 1100 bp for the PvCSP gene and were ~400 bp for the partial sequence of the PvMSP-1 gene.
Sequence analysis of PvCSP gene
Multiple sequence alignment of the translated nucleotide sequences was carried out to analyze polymorphisms in the pre-repeat, post-repeat and central-repeat regions of the PvCSP gene. The top hits for the PvCSP gene were extracted from the GenBank protein database using Blastp, and a PvCSP sequence from an Iranian isolate (KT588208.1) was retrieved and used as the reference sequence. Multiple sequence alignment of the extracted amino acid sequences was performed using ClustalW. Compared with the reference sequence (KT588208.1), the PvCSP sequences showed both VK210 and VK247 variant types. The majority (94%; 141/150) of the P. vivax isolates were of the VK210 variant and only 9 isolates were of the VK247 type. All PvCSP gene-based P. vivax variants started with the same pre-repeat sequence (KLKQP region). In the central-repeat region (CRR), the VK210 sequences comprised variable numbers of the peptide repeat motifs (PRMs) GDRADGQPA (PRM1) and GDRAAGQPA (PRM2), which were found in all the isolates. These were followed by two conserved post-repeat sequences, GNGAGGQAA (PRM3) and GGNAANK (PRM4), and one post-repeat insert, the KAEDA region. A one-copy GGNA repeat was found after the CRR in all the analyzed sequences. The frequency of PRMs in the CRR of PvCSP is summarized in Figure 2. The observed non-synonymous substitutions, based on the diverse repeat allotypes (RATs) that give rise to different PRMs, are listed in Table 1.
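The PRM frequencies described above can be tallied by scanning the repeat region in nine-residue steps. The following is only a minimal sketch of that idea, not the procedure used in this study: the function name `count_prms` and the toy sequence are hypothetical, and only the three nine-residue motifs named in the text are included (the seven-residue GGNAANK motif is left out for simplicity).

```python
from collections import Counter

# Nine-residue peptide repeat motifs named in the text (labels as in the text).
PRMS = {
    "GDRADGQPA": "PRM1",
    "GDRAAGQPA": "PRM2",
    "GNGAGGQAA": "PRM3",
}

def count_prms(crr, motifs=PRMS, width=9):
    """Tally known motifs while walking an amino acid sequence left to right.

    A recognized motif advances the scan by one full repeat unit; an
    unrecognized window advances by one residue until the scan is back in frame.
    """
    counts = Counter()
    i = 0
    while i + width <= len(crr):
        block = crr[i:i + width]
        if block in motifs:
            counts[motifs[block]] += 1
            i += width
        else:
            i += 1
    return counts

# Toy repeat region for illustration only; it is not a sequence from this study.
toy_crr = "GDRADGQPA" * 2 + "GDRAAGQPA" * 3 + "GNGAGGQAA"
print(count_prms(toy_crr))
```

Dividing each count by the total number of repeat units in an isolate would give per-isolate motif frequencies of the kind that a figure such as Figure 2 summarizes.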
PvCSP CRR based genetic population structure
The population genetic structure based on the PvCSP CRR of the P. vivax isolates was analyzed and compared with that of PvCSP isolates from the neighboring countries Iran, India and Myanmar. The PvCSP sequences grouped into fifteen distinct haplotypes with an estimated haplotype (gene) diversity (Hd) of 0.547 in the Pakistani samples, and into ten distinct haplotypes with an estimated Hd of 0.345 in the Iranian samples. In addition, Tajima’s D and Fu and Li’s D* and F* tests did not reject a neutral model of polymorphism; the values of these tests for the PvCSP variants from Pakistan, Iran, India and Myanmar are given in Table 2. The overall nucleotide and haplotype diversity were 0.02371±0.00056 and 0.084±0.00701, respectively. These results suggested that the CRR of the PvCSP population of Pakistan was under positive natural selection. The effect of natural selection was further examined with Tajima’s D, which was 0.54276 (P > 0.10). The Fu and Li’s D and F values for the CRR were also positive. Nucleotide diversity and natural selection were also analyzed in the PvCSP populations of Iran, India and Myanmar. The PvCSP populations from India and Iran showed high nucleotide diversity, whereas the values for the Myanmar PvCSP population were negative, suggesting negative selection. The Tajima’s D and Fu and Li’s D and F values for the CRR were also positive for the PvCSP populations of Iran and India, as shown in Table 2.
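For readers unfamiliar with the Hd statistic, the sketch below computes Nei's haplotype diversity, Hd = n/(n-1) × (1 - Σp_i²), where p_i is the frequency of haplotype i among n sequences. It is a minimal illustration under the assumption that identical aligned strings represent the same haplotype; the function name and the toy input are hypothetical, and the values reported in this study come from the authors' own analysis software rather than from this code.

```python
from collections import Counter

def haplotype_diversity(seqs):
    """Nei's haplotype (gene) diversity for a list of aligned sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    # Identical strings are treated as one haplotype (an assumption of this sketch).
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))

# Toy data: three haplotypes among four sequences (not data from this study).
toy = ["AAA", "AAA", "AAT", "ATT"]
print(len(set(toy)), round(haplotype_diversity(toy), 3))
```

Hd approaches 1 when almost every sequence is a distinct haplotype and falls to 0 when all sequences are identical.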
Phylogenetic analysis of PvCSP gene
A phylogenetic tree constructed from the PvCSP gene sequences is presented in Figure 3. Two separate clades can be inferred from the tree: one containing the VK210 variant type and the other the VK247 variant type of PvCSP isolates. Four sub-clusters of VK210 and VK247 can be distinguished in the leading clade. Associated taxa clustered together, and branch lengths are drawn in the units of the evolutionary distances used to infer the tree. Evolutionary distances were computed by the p-distance method, and the analysis involved 55 nucleotide sequences. All positions containing gaps or missing data were discarded. VK210 sequences from Pakistan showed 48-100% identity with PvCSP sequences from countries such as Iran, Greece, India, USA, Sri Lanka, Australia, Vanuatu and Myanmar, whereas VK247 sequences from Pakistan showed 100% identity with sequences from Iran, Colombia, Vanuatu, USA, India, Latin America and Korea.
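The p-distance with complete deletion of gapped positions, as described above, is simply the proportion of differing sites among the alignment columns retained for all sequences. The sketch below illustrates that calculation on an invented three-sequence alignment; the function names, the set of characters treated as missing data, and the toy alignment are assumptions made for this example, and uppercase input is assumed.

```python
GAP_CHARS = set("-?N")  # characters treated as gaps or missing data (assumption)

def complete_deletion(alignment):
    """Drop every column in which any sequence has a gap or missing base."""
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    kept = [col for col in zip(*seqs) if not GAP_CHARS & set(col)]
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}

def p_distance(a, b):
    """Proportion of sites at which two equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Toy alignment for illustration only; not sequences from this study.
toy = {"s1": "ACGTACGT", "s2": "ACGTAC-T", "s3": "ACCTACGA"}
clean = complete_deletion(toy)
d = p_distance(clean["s1"], clean["s3"])
print(clean["s1"], round(d, 3), f"{(1 - d) * 100:.1f}% identity")
```

Percent identity between two sequences is 100 × (1 - p), which is how a pairwise distance relates to identity ranges such as those quoted for the VK210 and VK247 sequences.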
Sequence analysis of PvMSP-1 gene
The top hits for the PvMSP-1 gene were extracted from the GenBank protein database using Blastp, and a PvMSP-1 sequence from an Iranian P. vivax isolate (KX697612.1) was retrieved and used as the reference sequence for comparison. The comparison showed that the PvMSP-1 sequences of the 30 isolates corresponded to the N-terminal partial sequence of other PvMSP-1 gene sequences. Overall, 13 single nucleotide polymorphisms (SNPs) were found among the 30 samples, with an average π value of 0.00143 in the PvMSP-1 gene. The sequence conservation (C) between the Pakistani and reference Iranian PvMSP-1 sequences was 0.835, indicating that the sequences have remained relatively unchanged and share a close evolutionary relationship. The overall genetic polymorphisms of the PvMSP-1 population are shown in Figure 4. The N-terminal non-repeat region of PvMSP-1 was well conserved, although amino acid changes were identified at low and uneven frequencies. Notable variation was observed between amino acid substitutions K55N and M78T/N, which occurred at low and uneven frequencies in a less conserved stretch of PvMSP-1.
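The π statistic quoted above is the average number of nucleotide differences per site between all pairs of sequences, and the SNP count corresponds to the number of segregating (polymorphic) alignment columns. The sketch below shows both calculations on invented data, assuming gap-free aligned sequences of equal length; the function names and toy sequences are hypothetical and do not reproduce the study's values.

```python
from itertools import combinations

def segregating_sites(seqs):
    """Number of alignment columns with more than one nucleotide state."""
    return sum(len(set(col)) > 1 for col in zip(*seqs))

def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (pi) for gap-free aligned sequences."""
    if len(seqs) < 2 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need two or more sequences of equal length")
    pairs = list(combinations(seqs, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs) / len(seqs[0])

# Toy alignment for illustration only; not sequences from this study.
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
print(segregating_sites(toy), nucleotide_diversity(toy))
```

In this toy case two columns are polymorphic and the six pairwise comparisons differ at one site on average, so π is 1/8 per site.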
PvMSP-1 N-terminal based genetic population structure
The population genetic structure based on the N-terminal region of the PvMSP-1 gene of the P. vivax isolates was analyzed and compared with that of isolates from the neighboring country Iran, as shown in Table 2. The haplotype diversity of the PvMSP-1 gene was comparable between the two countries (0.954 to 0.962). In addition, Tajima’s D and Fu and Li’s D* and F* tests did not reject a neutral model of polymorphism; the values of Fu and Li’s D and Fu and Li’s F for the PvMSP-1 variants from Pakistan and Iran are given in Table 2. The results for the PvMSP-1 population of Pakistan also indicated that positive natural selection may occur in the region. The overall nucleotide and haplotype diversity were 0.00162±0.0000026 and 0.012±0.00014, respectively. The effect of natural selection was estimated by Tajima’s D, which was 1.67790 (P > 0.10). The Fu and Li’s D and F values for the PvMSP-1 population were also positive. Nucleotide diversity and natural selection were also analyzed in the PvMSP-1 populations of Iran and India, which showed high nucleotide diversity with positive values of Tajima’s D, Fu and Li’s D and Fu and Li’s F, as shown in Table 2.
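Tajima's D compares two estimates of diversity: the mean number of pairwise differences (k) and the number of segregating sites (S) scaled by the sample size (n). A positive D means pairwise diversity exceeds what S predicts under neutrality, and significance is judged separately from the sign. The sketch below implements the standard formula from Tajima (1989) as an illustration only; the function name is hypothetical, the input values are a commonly used textbook-style example rather than data from this study, and the reported values were obtained with the authors' own software.

```python
import math

def tajimas_d(n, S, k):
    """Tajima's D from sample size n, segregating sites S and mean pairwise differences k."""
    if n < 3 or S < 1:
        raise ValueError("need n >= 3 sequences and at least one segregating site")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Numerator: difference between the two diversity estimates;
    # denominator: its estimated standard deviation.
    return (k - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))

# Illustrative inputs only (not values from this study): D is about -1.45 here.
print(round(tajimas_d(n=10, S=16, k=3.888889), 3))
```

With the same n and S, raising k above S/a1 turns D positive, which is the direction reported for the Pakistani PvMSP-1 and PvCSP populations.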
Phylogenetic analysis of PvMSP-1 gene
A phylogenetic tree was constructed from the PvMSP-1 gene sequences (Figure 5). Two distinct clades can be inferred from the tree. The first clade is further divided into three sub-clades and contains Pakistani isolates together with isolates from East Africa, Thailand, Mexico, India and USA, which share 16-95% identity with sequences from the Pakistani PvMSP-1 population. The second clade is further divided into two sub-clades containing isolates from Turkey, Iran, Korea and Southern Mexico in addition to Pakistani isolates, with 62-94% sequence identity.