Genetic Polymorphism of human CYP2D6 gene coding region results in failed radical cure of vivax malaria using primaquine

Background: Primaquine is the traditional medicine could effectively eliminate the hypnozoites of P. vivax, and therefore has been recommended by World Health Organization (WHO) as the anti-relapsing drug for the treatment of vivax malaria patient. However, 5-hydroxy-primaquine, the active component in primaquine to kill P. vivax hypnozoite has to be catalyzed by Cytochrome P450, family 2, subfamily D, polypeptide 6 (CYP2D6) enzyme in human hepatocytes. Nevertheless, suppressed CYP2D6 enzyme activity caused by CYP2D6 gene mutation has been considered as a risk factor for the ecacy of primaquine. Based on the analysis of CYP2D6 gene polymorphism in relapsed vivax malaria patients, the current study aims to preliminarily reveal the genetic association between human CYP2D6 genotype and the curative effect of primaquine in vivax malaria. Methods: Blood samples of vivax malaria cases received "chloroquine/primaquine eight-day therapy" from 2014 to 2018 in Yunnan Province were collected. The relapsed cases were determined by identifying the homogeneity of P. vivax circumsporozoite protein (pvcsp) gene sequences between the Plasmodiums isolated from patients’ infected blood samples. Human genomic DNA was extracted from the blood samples, and 9 exons of CYP2D6 gene were amplied by polymerase chain reaction (PCR) and were then sequenced. The coding DNA sequence (CDS) of CYP2D6 gene was obtained by splicing and aligning with the wild-type reference sequence. The mutation types of CDS and their association with vivax malaria relapse were analyzed, and the activity phenotype of CYP2D6 isozyme was predicted by the allelic form of CYP2D6 gene. Results: A total of 156 blood samples from 75 relapsed cases of vivax malaria and 75 non-relapsed cases were collected for nested-PCR amplication. Two nested-PCR products, the CYP2D6 gene fragments containing exon1-4


Background
In recent years, the global malaria epidemic has gradually scaled down, yet the death tolls of malaria remain high, and the number of malaria cases in certain areas has rebounded [1]. Meanwhile, multiple technological limitations pose daunting challenges to the elimination of malaria, such as the di culty in identifying low-density Plasmodium infection [2], the population expansion of drug resistant Plasmodium [3][4], and the lack of variety in antimalarial drug [5]. Although Chinese local cases (without a travel history to malaria epidemic areas outside China within the last 30 days before the onset of malaria) of malaria have not been reported since 2016 [6], China still need to cope with a large number of imported malaria cases every year. Located in southwest China bordering Myanmar, Laos and other Southeast Asian countries ( Fig.1), Yunnan Province had experienced large-scale malaria epidemic and was once the hardest-hit area of malaria. As long as Africa and Southeast Asia are the main sources of imported malaria cases in China [7], Yunnan Province is still facing a great risk of imported malaria, especially vivax malaria. From 2011 to 2018, 50.7% of the imported malaria cases in Yunnan Province are caused by P. vivax [8]. Therefore, the effective elimination of malaria in Yunnan Province calls for the strict enforcement to block the transmission of P. vivax. Primaquine is the traditional medication proved to be effective in eradicating the hypnozoites of P. vivax, and the World Health Organization (WHO) recommended it as the available antimalarial drug for the treatment of relapsed malaria (except for the new drug Tafenoquine) [9]. However, the active component in primaquine for killing P. vivax hypnozoite, 5-hydroxy-primaquine, requires the catalysis of Cytochrome P450, family 2, subfamily D, polypeptide 6 (CYP2D6) enzyme in human hepatocytes [10][11][12]. CYP2D6, also known as debrisoquine4-hydroxylase, is an isozyme belonging to the Cytochrome P450 (CYP450) superfamily and an important phase I drug-metabolizing enzyme in the human system. CYP2D6 is the only enzyme in the CYP450 superfamily that cannot be created through inducer in the environment and it has low capacity and high a nity. Even though CYP2D6 only accounts for 4% of the total P450 enzyme protein in the liver, it dominates the metabolism of ~30% commonly used drugs, including primaquine [12]. Studies have found that the heterogeneity of CYP2D6 activity is predominantly governed by its genetic variation [13], and that decreased CYP2D6 isoenzyme activity caused by genetic polymorphism would obstruct the generation of 5-hydroxy-primaquine, making large doses or repeated use of primaquine a common remedy to compensate for the declined e cacy of primaquine in the treatment.
Meanwhile, primaquine can cause life threatening hemolysis in patients with Glucose-6-Phosphate Dehydrogenase (G6PD) de ciency, subsequently in icting dual hazards, including low e cacy of antirelapse treatment of primaquine and the continuously hemolysis, on the patients [9,[14][15]. Owing to the dismal fact that the distribution of patients with defective CYP2D6 is highly coincident with those with G6PD-de ciency in the endemic area of vivax malaria [16][17], the patient's G6PD gene type and CYP2D6 enzyme activity should be identi ed by care workers prior to receiving anti-relapse treatment of vivax malaria [14,17]. WHO de nes that G6PD de ciency, impaired CYP2D6 function, pregnant or lactating women, and young infants are not suitable for primaquine treatment [12,14].
CYP2D6 enzyme activity is measured mainly through phenotype observation and genotype analysis prediction. Compared with phenotype measurement, prediction method based on CYP2D6 genotype is characterized by the following advantages [18][19]. Firstly, genotypes do not change throughout one's life, and are not affected by environment or physical factors [19][20]. Secondly, genotyping only requires one specimen and can be completed before medication treatment [21]. In addition, human CYP2D6 genes can be easily sequenced to determine the mutation types of CYP2D6 alleles [22][23]. Therefore, the use of genotyping method can create more stable results in predicting the phenotype of CYP2D6 enzyme activity, and easy-to-handle for related research [18,[24][25][26][27][28]. The prediction is based on the locus status of CYP2D6 gene [13], or the allelic combination form of zygote [29][30]. The identi ed genotypes consist of alleles with complete function, reduced function or loss of function; the indications of CYP2D6 enzyme activity include Poor metabolizer (PM), Intermediate metabolizer (IM), Normal metabolizer (NM) and Ultrarapid metabolizer (UM), covering various active levels of CYP2D6 enzymes from ultra-fast to nonmetabolic [28,30]. However, more than 80 CYP2D6 alleles have been de ned by the Cytochrome P450 Nomenclature Committee at http://www.cypalleles.ki.se [23]. The prediction of CYP2D6 enzyme activity by performing genotype analysis needs to be further optimized to enhance accuracy and systematicness [31].
Gene mutations of enzymes related to drug metabolism not only affect the clinical consequences of drug metabolism, but can even lead to individual adverse event [13]. Therefore, understanding the speci c CYP2D6 enzyme activity in different individual is the prerequisite to ensure radical treatment of vivax malaria based on personalized primaquine dose [28]. Despite the frequent use of primaquine in malaria endemic areas of Yunnan, no systematic survey about genetic polymorphisms of CYP2D6 has been conducted. Therefore, the appropriate population for primaquine treatment is yet to be clari ed. In order to identify the unsuitable population for primaquine use and to understand the CYP2D6 enzyme activity level of the population in malaria endemic areas of Yunnan Province. This study intends to investigate the genetic association between human CYP2D6 genotype and declined e cacy of primaquine in the radical cure of P. vivax.

Materials And Methods
Subjects and blood sample Blood samples of diagnosed and reported vivax malaria cases were consecutively collected in Yunnan Province from 2014 to 2018. Samples used for reexamination were provided by the Centers for Disease Control and Prevention in Yunnan Province. The mono-infection of P. vivax requires dual parasitically con rmation by both microscopy examination and Plasmodium 18S rRNA gene detection (PCR) by Yunnan Province Reference Laboratory (YNRL). Dried blood spots samples for genetic analysis were derived from 0.6 ml of venous blood, which were then put in a drying tube and stored at -80 ℃ before DNA extraction. The cases were divided into two groups: relapsed group and non-relapsed group. All vivax malaria patients spent three days in the hospital and underwent three-days oral Chloroquine therapy (total 1550 mg), followed by an 8-day course of Primaquine (22.5 mg/day). The relapsed cases consisted of patients who relapsed after receiving above-mentioned treatment, and their infection was caused by the same Plasmodium in two samples detected by sequencing of P. vivax circumsporozoite protein (pvcsp). A total of 75 relapsed cases were con rmed in Yunnan Province from 2014 to 2018. Nonrelapsed cases were randomly selected from vivax malaria cases of 2018. To avoid bias caused by different sample size, the two groups had equal number of cases, and the total number of experimental samples was set as 150. The infection sources of vivax malaria cases were con rmed according to epidemiological investigation: those without a travel history to epidemic areas outside Yunnan Province within the last 30 days before the onset of malaria were de ned as local cases; those have history of travelling to epidemic regions, such as Myanmar and Africa, were regarded as imported cases.
Extraction of human genomic DNA Three dried blood spots (diameter = 5 mm) were used, and human genomic DNA was extracted by using QIAgen Mini Kit kit (Germany, QIAamp Company's DNA Mini Kit), according to the manufacture's instruction. The extracted DNA was stored at -20 ℃ for the subsequent assay.

PCR ampli cation of CYP2D6 gene fragment
The primers of CYP2D6 for polymerase chain reaction (PCR) ampli cation were designed by using GenBank (https://www.ncbi.nlm.nih.gov/gene/), reference sequence (ID NC_000022.11) was used as template, following the reaction condition in previous studies [25,32]. The primers for rst-round nested PCR covering the coding region of CYP2D6 gene exons1~4 were: 5'-CCAGTGACAGATAAGGGTGC-3' and 5'-GACGTGGATAGGAGGTACAGAG-3'; the primers for second-round nested PCR were: 5'-GGTGACTTCTCCGACCAGG-3' and 5'-TTCCCAAACCCATCTATGC-3'. The nal ampli cation region was 42131088 ~ 42128678, and the expected fragment length of the product was 2411 bp. The primers that cover exons 5~9 of CYP2D6 gene were: 5'-GCCGACTGAGCCCTGGGAGGTAGGTA-3' and 5'-GCTGGGGCCTGAGACTT-3. The ampli cation area was 42126035~42128422, and the ampli cation products had an expected fragment length of 2388 bp. For all the PCR reaction systems, we used 2.6μl DNA template, 14.0μl 2 × Taq PCR hybrid system (QIAGEN, Germany), 0.7μl upstream primer (20umol / L), 0.7μl downstream primers (20umol / L). The total volume was adjusted to 25.0μl with ddH 2 O. PCR reaction conditions were clari ed as follows: 92 to 94 ° C for 2-5 mins; 92 to 94 ° C for 10-30s, 50 to 56 ° C for 15-30s, 68 to 72 ° C for 2 to 3 min, 35 cycles; 68 to 72 ° C for 7 mins. The triplicated parallel repetition was adopted for each PCR reaction. The ampli ed products were observed using 1.5% agarose gel electrophoresis. The positive ampli cation products were sent to Shanghai Meiji Biomedical Technology Co, Ltd. for sequencing by using the dideoxy chain-termination method. Only the sequences that showed identical results in at least two tests were used for subsequent analysis.

Mutation analysis of CYP2D6 coding region
The sequencing results were aligned by using DNAStar v5.10 and BioEdit v7.0.9.0 software. The obtained all DNA sequences were assessed by using the Basic Local Alignment Search Tool (BLAST, http://blast.ncbi.nlm.nih.gov/Blast.cgi) in the NCBI platform. Sequences with identi cations equals 100% and Query cover above 99% were considered as CYP2D6 gene sequence of human. The DNA sequence of exon 1~9 region of CYP2D6 gene was determined and spliced into CYP2D6 CDS in the order from exon1 to exon9. MEGA v5.04 software was used to align the CDS of each case with the wild-type sequence to con rm the loci of missense mutation and synonymous mutation. DnaSP v6.11.01 software was used to identify missense mutation and synonymous mutation locus in each CDS, and the parameters of DNA diversity index (π) and expected heterozygosity (He) were calculated. The allele form of CYP2D6 gene coding region was determined by using the method described in previous studies [25][26][27].
SPSS software (version 21.0; IBN; Chicago; IL) was used to conduct chi-square test on the frequency of haplotype co-existed in relapsed and non-relapsed patients. The odds ratio (OR) value of haplotypes with statistical signi cance were calculated by comparing to the relapse rate of P. vivax. The signi cance level was set as P < 0.05; OR 1 indicates that the haplotype is a protective factor, otherwise a risk factor.

Prediction of CYP2D6 enzyme activity
The mutation loci (base substitution loci) of CYP2D6 gene were used to directly predict the enzymatic activity as one of the following four phenotypes, PM, IM, NM or UM. The allelic form of the coding region of CYP2D6 gene was identi ed according to the methods described by Gaedigk et al [20], Sistonen et al [25] and Zanger et al [27]. Among them, CYP2D6* 1 is wild-type allele; CYP2D6* 2 showed base substitutions at c. 408, c. 886 and c. 1457. Nevertheless, both two have the same functional normal allele type. CYP2D6 * 10 and CYP2D6* 41 are the alleles of decreased enzyme activity caused by C > T and G >A base substitution at c.100 and c.1023+36, respectively.
Subsequently, we assigned the activity score of the allelic form of CYP2D6 gene according to previously described methods [28,31]. It was found that the mutations at c. 408, c. 886 and c. 1457 are not implicated in impaired enzyme activity, therefore these three loci cannot be included in the activity score system. By comparison, the wild type alleles of c.100 and c.1023+36 had the assigned score of 1.0. The mutant alleles of c.100 and intron c.1023+36 scored 0.25 and 0.5, respectively. For the event of multiple loci in the genotype, the enzyme activity score was calculated as the sum of the scores of the included zygotes. The intervals of the diploid score within 0.0; 0 < x < 1. 25

Results
Sample information and PCR ampli cation of human CYP2D6 gene A total of 75 relapsed cases of P. vivax were enrolled, including 71 cases of one relapsed event, 2 cases of two relapse events, and two case of three relapsed events. The geographic information of 75 nonrelapsed cases is listed in Table 1. The male-female ratio was 2.66:1, and the proportions of cases imported from Myanmar, Africa countries, Laos and Yunnan were 96.7% (145/150), 1.3% (2/150), 0.7% (1/150) and 1.3% (2/150), respectively. The odds ratio of relapsed was calculated in different age groups and genders. Statistical signi cance was determined in age groups of 5-20 years (OR = 0.380; 95% CI: 0.986~0.146) and 21-60 years (OR = 2.471:95% CI: 5.417~1.127). Thusly, the risk of relapsed of vivax malaria in patients aged 5-20 years was signi cantly lower, whereas the risk of relapsed in the patients aged 21-60 was 2.471 times higher than that of other groups. No signi cant correlation between other age groups or gender and the relapsed of vivax malaria was determined.  The ampli cation products showed clear bands at 2000bp, which were considered as the target bands (Fig 2.). The PCR ampli cation products of CYP2D6 gene from the relapsed cases and non-relapsed cases were trimmed, and the CDS chains containing the complete exon1-9 (total length = 1491bp) were obtained from all samples. A total of 150 CDS chains (Genbank accession number: MT339044-MT339193) were selected from the 75 relapsed case and 75 non-relapsed cases, and were then compared with wild-type sequence (NC: 000022.11). Base substitutions at 12 loci, such as c.31 and c.100, were determined ( Table  2). The proportions of third-base and rst-base substitution in the codon triplet were 41.7% (5/12), and the proportion of second-base substitution was 16.6% (2/12). 7 missense mutation loci and 5 synonymous mutation loci were determined. The mutation loci of the relapsed cases accounted for 91.7% (11/12), whereas the mutation loci of non-relapsed cases accounted for 66.7% (8/12) ( Table 2)

Haplotypes of CYP2D6 gene coding region and their association with relapse
We selected the Hap_2 ~ Hap_6 haplotype sequences which had more than one CDS and were found in relapsed and non-relapsed patients to calculate the odds ratio (Table3  The enzyme activity of Hap_6 and Hap_4 genotypes, which were validated to implicate in recurrence of vivax malaria, was predicted. As for the 12 CDS chains in Hap_6 (Table 4) Table 4). The genotypes of *1/*2 and *1/*41 were found in the non-relapsed cases. However, *1/*41 was also determined in the relapsed cases and reached 1.5 in genotype score. Hence its enzyme activity was predicted as "NM" level. Moreover, *1/*2 genotype which showed mutant heterozygous (C/T) only at c.886 reached the peak genotype score of 2.0 and was therefore predicted as "NM" level (Table 4).
To our knowledge, the current study is the rst one that attempts to analyze mutation polymorphism in the whole coding region of human CYP2D6 gene, thereby verifying its possible association with thwarted CYP2D6 enzyme activity in relapsed vivax malaria patients after primaquine treatment. Among the 24 haplotypes de ned, Hap_2 makes up the largest proportion of the sequences, reaching as high as 36.6% (55/150). This haplotype shows multiple mutations at four loci, including c.100C > T, c.336C > T, c.408G > C, c.1457G > C. These mutations are different from previous studies which reported that CYP2D6*10 allele is frequently detected in Asian populations [32,34]. CYP2D6*10 allele has multiple mutations at c. 100(C > T), c. 408(G > C) and c. 1457(G > C), yet is devoid of c. 336C > T mutation as opposed to Hap_2. Such disparity might be attributed to fact that the sequencing of the whole gene coding region can easily detect all base substitutions. However, the factor of different population should be taken into account to determine whether such differences in common alleles of CYP2D6 gene are worthy of in-depth research.
The population studied in this paper is not randomly selected, but is composed of vivax malaria patients residing in endemic areas, and 50% of them (75/150) had relapsed malaria. Thusly, the genotypes of such population may show the accurately selective outcome of human defective genes after the screening of malaria epidemic.
Although decreased enzyme activity in the heterogeneity of CYP2D6 is speculated to lead to failed therapy of primaquine to kill hypnozoite of P. vivax [9,17,[35][36][37], the correlation between different genotypes and the relapse of P. vivax is poorly understood. In this study, we found that the genotypes with combined mutations at c. 408, c. 886 and c. 1457 were most closely related to the relapse of vivax malaria (OR = 5.615, P < 0.05) ( unique phenomenon of reduced risk of vivax malaria relapse (OR = 0.159, P < 0.05), and such ndings warrants further analysis and con rmation.
To date, direct prediction method based on CYP2D6 locus mutation polymorphism, which was summarized by Zanger et al [27] and indirect prediction method of genotype scoring, which was recommended by Gaedigk et al [28] and Caudle et al [31], are applied to predict CYP2D6 enzyme activity based on genetic information. Since the genotype scoring method can quantify the degree of gene mutation and abides by more abundant genotypic criteria, this method was used in the study. However, no matter Hap_6 indicates high risk of relapse or Hap_4 functions as the protective factors, we predicted that all the genotypes are above IM level. CYP2D6*2 is the allele type that indicates highest risk of relapse, and most of the non-relapsed patients had CYP2D6*10 allele. This result is not in line with that of Brasil et al [36], which reported that most of the alleles in relapsed patients are CYP2D6*10, CYP2D6*39 and so forth. Moreover, it contradicts the clinical observation of relapsed patients in this study. Such con ict of outcome might be justi ed that the prediction method of CYP2D6 enzyme activity used in this study focuses on analyzing the whole gene coding region information, thereby taking more alleles into account and leading to enhanced genotype score. In contrast, Brasil et al., detected only a few mutation loci and therefore could miss out the reverse effects of other locus mutations. CYP2D6*2 allele shows heterogeneous phenotypes exhibiting both normal and abnormal enzyme activity [39]. Therefore, identifying the heterogeneous CYP2D6 with similar genotypes is the key to predicate the enzyme activity of CYP2D6 [20,28], regardless of the method used for detection. Meanwhile, the prediction method should be further optimized, because the effect of primaquine resistance in patients with relapsed vivax malaria is not eliminated.
Although this study applied rst-generation sequencing method to analyze polymorphism in the coding region of CYP2D6 gene, the strategy of triplicated parallel repetition was adopted for each PCR reaction and the sequencings of its products. Such effort helped eliminate the rate of error in single PCR sequencing to a certain extent. Meanwhile, the cost-effective sequencing analysis method is more suitable for molecular epidemiological cross-sectional investigation of CYP2D6 gene mutation in the endemic areas where primaquine is used to intervene vivax malaria. However, this study has some limitations. Firstly, the sample size is not large enough, and the low-frequency alleles of CYP2D6 genes detected in the study should be further veri ed. Secondly, all the non-relapsed patients were selected from 2018, which is not equivalent to the relapsed patients enrolled for four consecutive years. Sampling bias may affect the experimental results due to changes of various factors over the years. In addition, other confounding factors which may amplify the identi cation of relapsed vivax malaria patients, such as the existence of primaquine-resisted Plasmodium, are not ruled out, despite the identi cation of relapsed cases has been achieved by comparing the genetic origin of infected strains. Finally, CYP2D6 enzyme activity was analyzed by assigning the activity score of zygotic allelic, a method recommends by Caudle et al [31], yet this study de ned more CYP2D6 gene locus polymorphisms. Since there is no benchmark standard to scores the activity of these loci polymorphisms, the prediction of CYP2D6 enzyme activity in this study may not be completely homogenous with the ndings of Caudle et al.

Conclusion
In conclusion, the prediction of CYP2D6 enzyme activity based on whole gene coding region sequencing preliminarily reveals CYP2D6 gene mutation polymorphisms in relapsed P. vivax patients who showed poor curative effect of primaquine in Yunnan. This study makes up for the analysis of genetic correlation factors in relapsed P. vivax patients in China, and reaches the following ndings: (1) Among the multiple mutations of CYP2D6 gene, combined mutations at three loci, including c.408, c.886 and c.1457, are most closely related to decreased CYP2D6 enzyme activity. Mutations at c.408 and c.1457 induce compromised CYP2D6 enzyme activity, yet a larger sample size is need to validate whether such decreased activity is aggravated by c.886 mutation; (2) The key to predict CYP2D6 enzyme activity is to de ne the genotype accurately and accordingly, based on the genetic correlation between genotype and enzyme activity phenotype with reasonable sample size. Because the whole gene sequencing method can fully identify the mutation loci, it is a promising molecular epidemiological method for investigating CYP2D6 gene mutation polymorphism. Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional a liations. Figure 1 1 The map of China highlighting the geographic location of Yunnan Province. Different colors represent different administrative areas, and green labels indicate the main source of infection in areas with many imported cases. Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors.