In the present study, a total of 150 blood samples were collected from patients infected with P. vivax malaria at hospitals in the twin cities of Islamabad and Rawalpindi. The samples were examined by microscopy to confirm the presence of P. vivax parasites, and samples with mixed Plasmodium species infections were excluded. The PCR products ranged from 900 to 1100 bp for the pvcsp gene and were ~400 bp for the partial sequence of block 2 of the pvmsp-1 gene.
Sequence analysis of pvcsp gene
Multiple sequence alignment of the translated nucleotide sequences was carried out to analyze polymorphisms in the pre-repeat, central-repeat and post-repeat regions of the pvcsp gene. The top hits for the pvcsp gene were extracted from the GenBank protein database using Blastp, and a pvcsp sequence from an Iranian isolate (KT588208.1) was retrieved and used as the reference sequence. The extracted amino acid sequences were aligned using ClustalW. Comparison with the reference sequence showed infections with both the VK210 and VK247 variant types. The majority (92%; 32/35) of the P. vivax isolates were of the VK210 variant, and only 3 isolates were of the VK247 type. All the pvcsp variants started with the same pre-repeat sequence (KLKQP, region I). In the central-repeat region (CRR), the VK210 sequences comprised variable numbers of the peptide repeat motifs (PRMs) GDRADGQPA (PRM1) and GDRAAGQPA (PRM2), which were found in all the isolates. These were followed by two conserved post-repeat sequences, GNGAGGQAA (PRM3) and GGNAANK (PRM4), and one post-repeat insert, the KAEDA region. A single-copy GGNA repeat was found after the CRR in all the analyzed sequences. The frequency of PRMs in the CRR of pvcsp is summarized in Figure 2. The non-synonymous substitutions observed among the different repeat allotypes (RATs), which give rise to the different PRMs, are listed in Table 1.
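As an illustration of how PRM frequencies in the CRR could be tallied, the following minimal Python sketch scans a translated sequence for the four motifs named in the text. The function name `count_prms` and the example sequence are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter

# Motifs and labels as named in the text.
PRMS = {
    "GDRADGQPA": "PRM1",
    "GDRAAGQPA": "PRM2",
    "GNGAGGQAA": "PRM3",
    "GGNAANK": "PRM4",
}

def count_prms(protein: str) -> Counter:
    """Scan left to right, consuming a whole motif whenever one matches."""
    counts = Counter()
    i = 0
    while i < len(protein):
        for motif, name in PRMS.items():
            if protein.startswith(motif, i):
                counts[name] += 1
                i += len(motif)
                break
        else:
            i += 1  # no motif starts here; move one residue on
    return counts

# Hypothetical CRR fragment, built only to exercise the function.
example = "KLKQP" + "GDRADGQPA" * 2 + "GDRAAGQPA" * 3 + "GNGAGGQAA" + "GGNAANK"
print(count_prms(example))
```

Consuming each matched motif in full avoids counting overlapping partial matches; real isolates would need the alignment-derived CRR boundaries rather than a hand-built string.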
pvcsp CRR based genetic population structure
The population genetic structure based on the pvcsp CRR of the P. vivax isolates was analyzed and compared with pvcsp isolates from the neighboring countries Iran, India and Myanmar. The pvcsp sequences fell into fifteen distinct haplotypes with an estimated haplotype (gene) diversity (Hd) of 0.547 in the Pakistani samples and ten distinct haplotypes with an estimated Hd of 0.345 in the Iranian samples. The values of Tajima’s D and of Fu and Li’s D* and F* tests for the pvcsp variants from Pakistan, Iran, India and Myanmar are given in Table 2. Fu and Li’s D* and F* values for the CRR were positive, suggesting that the CRR of the Pakistani pvcsp population was under positive natural selection. Nucleotide diversity in the pvcsp populations of Pakistan, Iran, India and Myanmar was highly significant (P < 0.05), whereas haplotype diversity was significant (P < 0.05) in Pakistan and Iran and non-significant (P > 0.05) in India and Myanmar. The pvcsp populations from India and Iran showed high nucleotide diversity, whereas the test values for the pvcsp population from Myanmar were negative, suggesting negative selection. The values of Tajima’s D and of Fu and Li’s D* and F* for the CRR were also positive for the pvcsp populations of Iran and India, as shown in Table 2.
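For readers unfamiliar with the statistic, haplotype diversity can be sketched as follows, assuming Nei's unbiased estimator Hd = n/(n-1) × (1 - Σ p_i²), where p_i is the frequency of the i-th haplotype among n sequences. The software and estimator actually used in the study are not stated here, so this is an assumption, and the toy sequences are hypothetical.

```python
from collections import Counter

def haplotype_diversity(seqs):
    """Nei's unbiased haplotype (gene) diversity for aligned sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    # Each distinct sequence string is treated as one haplotype.
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))

# Hypothetical toy sample: 3 haplotypes among 4 sequences.
toy = ["AAGT", "AAGT", "AAGC", "ACGT"]
print(len(set(toy)), round(haplotype_diversity(toy), 4))
```

Hd approaches 1 when nearly every sequence is a distinct haplotype and 0 when all sequences are identical.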
Table 1. Nucleotide sequence of four repeated allotypes (RATs) and the peptide repeat motif (PRMs) in the central-repeat region of pvcsp gene

| PRMs | Nucleotide sequence of the repeat allotypes (RATs) |
|---|---|
| GDRADGQPA (PRM1) | GGAGACAGAGCAGATGGACAGCCAGCA |
| GDRADGQPA (PRM1) | GGAGACAGAGCAGATGGACAGCCAGCA |
| GDRAAGQPA (PRM2) | GGTGATAGAGCAGCTGGACAACCAGCA |
| GDRADGQPA (PRM1) | GGTGATAGAGCAGATGGACAGCCAGCA |
| GDRAAGQPA (PRM2) | GGCGATAGAGCAGCTGGACAGCCAGCA |
| GDRADGQPA (PRM1) | GGCGATAGAGCAGATGGACAGCCAGCA |
| GDRAAGQPA (PRM2) | GGAGATAGAGCAGCTGGACAGCCAGCA |
| GDRADGQPA (PRM1) | GGCGATAGAGCAGATGGACAGCCAGCA |
| GDRAAGQPA (PRM2) | GGAGATAGAGCAGCTGGACAGCCAGCA |
| GDRADGQPA (PRM1) | GGCGATAGAGCAGATGGACAGCCAGCA |
| GDRAAGQPA (PRM2) | GGAGATAGAGCAGCTGGACAACCAGCA |
| GDRAAGQPA (PRM2) | GGTGATAGAGCAGCTGGACAACCAGCA |
| GDRADGQPA (PRM1) | GGAGATAGAGCAGATGGACAACCAGCA |
| GDRAAGQPA (PRM2) | GGAGATAGAGCAGCTGGACAGCCAGCA |
| GDRAAGQPA (PRM2) | GGAGATAGAGCAGCTGGACAGCCAGCA |
| GDRAAGQPA (PRM2) | GGAGATAGAGCAGCTGGACAGCCAGCA |
| GDRAAGQPA (PRM2) | GGAGATAGAGCAGCTGGACAGCCAGCA |
| GNGAGGQAA (PRM3) | GGAAATGGTGCAGGTGGACAGGCAGCA |
| GGNAANK (PRM4) | GGAGGAAATGCGGCAAACAAG |
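The correspondence between the RATs and the PRMs in Table 1 can be checked by translating each nucleotide repeat. The sketch below assumes the standard genetic code and uses four RATs copied from Table 1; the helper name `translate` is illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate an in-frame DNA string, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# One RAT per PRM, taken from Table 1.
rats = [
    "GGAGACAGAGCAGATGGACAGCCAGCA",
    "GGTGATAGAGCAGCTGGACAACCAGCA",
    "GGAAATGGTGCAGGTGGACAGGCAGCA",
    "GGAGGAAATGCGGCAAACAAG",
]
for rat in rats:
    print(rat, translate(rat))
```

RATs that differ only at synonymous positions (for example GGA, GGT and GGC in the first codon, all encoding glycine) translate to the same PRM, whereas a GAT to GCT change in the fifth codon converts PRM1 into PRM2.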
Table 2. Estimates of nucleotide and haplotype diversity and DNA sequence polymorphisms of the CRR of the P. vivax pvcsp gene and block 2 of the pvmsp-1 gene in Pakistan

| Population | Gene analyzed | No. of sequences | Fragment studied | Size of the fragment (a.a)# | No. of haplotypes | Nucleotide diversity ± SD | Haplotype diversity ± SD | Fu & Li’s D* p value | Fu & Li’s F* p value | Tajima’s D* p value |
|---|---|---|---|---|---|---|---|---|---|---|
| Pakistan | pvcsp | 35 | CRR | 364 | 15 | 0.02371 ± 0.00056 | 0.084 ± 0.00701 | 1.17870 | 1.12083 | 0.54276 |
| Iran | pvcsp | 28 | CRR | 360 | 10 | 0.02001 ± 0.00031 | 0.057 ± 0.00478 | 1.01980 | 1.80767 | 0.43556 |
| Myanmar | pvcsp | 15 | CRR | 354 | 7 | 0.01781 ± 0.0008 | 0.056 ± 0.015 | -1.20965 | -1.06781 | -0.78645 |
| India | pvcsp | 25 | CRR | 324 | 17 | 0.0370 ± 0.0064 | 0.681 ± 0.076 | 1.05422 | 1.02237 | 0.35674 |
| Pakistan | pvmsp-1 | 30 | block 2 | 143 | 10 | 0.00162 ± 0.0000026 | 0.012 ± 0.00014 | 1.86276 | 2.13897 | 1.67790 |
| Iran | pvmsp-1 | 32 | block 2 | 151 | 10 | 0.00159 ± 0.0000023 | 0.012 ± 0.00014 | 1.78902 | 1.90878 | 1.56845 |
| India | pvmsp-1 | 25 | block 2 | 155 | 7 | 0.0212 ± 0.0005 | 0.989 ± 0.010 | 1.55433 | 1.744009 | 1.66792 |

*P > 0.10
# number of amino acids
Phylogenetic analysis of pvcsp gene
A phylogenetic tree inferred from the pvcsp gene sequences is presented in Figure 3. Two separate clades can be inferred from the tree: one containing the VK210 variant type and the other the VK247 variant type of the pvcsp gene. Four sub-clusters of VK210 and VK247 can be distinguished in the leading clade. Associated taxa clustered together, and the tree is drawn with branch lengths in the same units as the evolutionary distances used to infer it. The evolutionary distances were computed by the p-distance method from 55 nucleotide sequences. All positions containing gaps or missing data were discarded. VK210 sequences from Pakistan showed 54% identity with pvcsp sequences from countries such as Iran, Greece, India, USA, Sri Lanka, Australia and Vanuatu, and 100% identity with sequences from Myanmar, whereas VK247 sequences from Pakistan showed 100% identity with pvcsp sequences from Iran, Colombia, Vanuatu, USA, India, Latin America and Korea.
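The distance calculation described above can be sketched as follows: alignment columns containing a gap or missing base are removed from all sequences (complete deletion), and the p-distance is then the proportion of remaining sites at which two sequences differ. The function names and the toy alignment are hypothetical, and treating any character other than A, C, G or T as missing is an assumption of this sketch.

```python
def complete_deletion(aln):
    """Drop every alignment column that contains a gap or ambiguous base."""
    valid = set("ACGT")
    keep = [i for i in range(len(aln[0])) if all(s[i] in valid for s in aln)]
    return ["".join(s[i] for i in keep) for s in aln]

def p_distance(a, b):
    """Proportion of sites at which two equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Hypothetical aligned sequences; column 7 holds a gap and an N.
toy_aln = ["ACGTAC-T", "ACGTTCAT", "ACCTACNT"]
clean = complete_deletion(toy_aln)
for i in range(len(clean)):
    for j in range(i + 1, len(clean)):
        print(i, j, round(p_distance(clean[i], clean[j]), 4))
```

Percent identity between two sequences is then 100 × (1 - p-distance) over the retained sites.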
Sequence analysis of pvmsp-1 gene
The top hits for the pvmsp-1 gene were extracted from the GenBank protein database using Blastp, and a pvmsp-1 sequence from an Iranian P. vivax isolate (KX697612.1) was retrieved and used as the reference sequence. Comparison with this reference showed that the pvmsp-1 sequences of 30 samples corresponded to the partial sequence of block 2 of the pvmsp-1 gene. Overall, 13 single nucleotide polymorphisms (SNPs) were found among the 30 sequences, with an average π value of 0.00143 in block 2 of the pvmsp-1 gene. The average sequence conservation between the Pakistani and the reference Iranian pvmsp-1 gene was C = 0.835, indicating that the sequences have remained relatively unchanged and are closely related evolutionarily. The overall genetic polymorphism of the pvmsp-1 population is shown in Figure 4. Amino acid changes were identified at low and uneven frequencies in block 2 of the pvmsp-1 gene. Notable variation was observed between amino acid positions 55 (K55N) and 78 (M78T/N), where the substitutions occurred at low and uneven frequencies and the sequence was less conserved.
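To make the two summary statistics concrete, the sketch below counts polymorphic (segregating) sites and computes nucleotide diversity π as the mean number of pairwise differences per site. It assumes a gap-free alignment of equal-length sequences; the function names and toy data are hypothetical and do not reproduce the study's values.

```python
from itertools import combinations

def segregating_sites(aln):
    """Number of alignment columns with more than one nucleotide state."""
    return sum(len(set(col)) > 1 for col in zip(*aln))

def nucleotide_diversity(aln):
    """Mean pairwise differences per site (pi) over all sequence pairs."""
    pairs = list(combinations(aln, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / (len(pairs) * len(aln[0]))

# Hypothetical toy alignment of four 8-bp sequences.
toy_block = ["ACGTACGT", "ACGTACGA", "ACCTACGT", "ACGTACGT"]
print(segregating_sites(toy_block), nucleotide_diversity(toy_block))
```

A low π alongside several SNPs, as reported for block 2, arises when most variants are carried by only a few sequences.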
pvmsp-1 N-terminal based genetic population structure
The population genetic structure based on the N-terminal region of the pvmsp-1 gene of the P. vivax isolates was analyzed and compared with isolates from the neighboring country Iran, as shown in Table 2. The haplotype diversity of the pvmsp-1 gene was comparable between the two countries, ranging from 0.954 to 0.962. In addition, Tajima’s D and Fu and Li’s D* and F* tests were consistent with a neutral model of polymorphism; the values for the pvmsp-1 variants from Pakistan and Iran are given in Table 2. Fu and Li’s D* and F* values for the pvmsp-1 population were positive, indicating that positive natural selection may be acting in the region. The overall nucleotide and haplotype diversity were 0.00162 ± 0.0000026 and 0.012 ± 0.00014, respectively. Nucleotide diversity in the pvmsp-1 populations of Pakistan, Iran and India was highly significant (P < 0.05), whereas haplotype diversity was significant (P < 0.05) in Pakistan and Iran. The effect of natural selection was estimated by Tajima’s D, which was 1.67790 (P > 0.10).
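Tajima’s D compares the mean number of pairwise differences (k) with the number of segregating sites (S) scaled by a1 = Σ 1/i for i = 1 to n-1. The sketch below follows Tajima's (1989) formula for a gap-free alignment; it returns only the statistic and does not compute the P value, which requires comparison against a null distribution. The function name and toy alignment are hypothetical.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(aln):
    """Tajima's D for a gap-free alignment of equal-length sequences."""
    n = len(aln)
    # S: number of segregating (polymorphic) alignment columns.
    s = sum(len(set(col)) > 1 for col in zip(*aln))
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and one segregating site")
    # k: mean number of pairwise nucleotide differences (not per site).
    pairs = list(combinations(aln, 2))
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Hypothetical toy alignment in which both variants are singletons.
toy_sample = ["ACGTACGT", "ACGTACGA", "ACCTACGT", "ACGTACGT"]
print(round(tajimas_d(toy_sample), 3))
```

In the toy sample both variants occur in a single sequence, so k falls below S/a1 and D is negative; positive values such as those in Table 2 arise when intermediate-frequency variants are in excess.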
Phylogenetic analysis of pvmsp-1 gene
A phylogenetic tree was constructed from the pvmsp-1 gene sequences (Figure 5). Two distinct clades can be inferred from the tree. The first clade is further divided into three sub-clades and contains Pakistani isolates together with isolates from East Africa, Thailand, Mexico, India and USA, which share 16 to 95% identity with the sequences of the Pakistani pvmsp-1 population. The second clade is further divided into two sub-clades and contains isolates from Turkey, Iran, Korea and Southern Mexico in addition to Pakistani isolates, with 62 to 94% sequence identity.