Polymorphism in coding region of human CYP2D6 gene contributes to failed radical cure of vivax malaria using primaquine

Background: Recent studies found that damaged enzyme activity of cytochrome P450 isoenzyme 2D6 (CYP2D6) can lead to compromised efficacy of primaquine in killing Plasmodium during liver dormancy and consequently relapsed malaria. Therefore, WHO has listed the decrease enzyme activity of CYP2D6 as one of the four scenarios that are not suitable for primaquine therapy. Based on analysis of CYP2D6 gene polymorphisms in relapsed vivax malaria patients, this study preliminarily revealed the genetic association between human CYP2D6 genotype and the declined efficacy of primaquine treatment in vivax malaria. Methods: Blood samples of vivax malaria cases treated with "chloroquine/primaquine eight-day therapy" from 2014 to 2018 in Yunnan Province were collected, and the vivax malaria relapsed cases after clinical curing were determined by identifying the infected Plasmodium isolates through gene traceability analysis. Human genomic DNA was extracted from blood samples, and 9 exons of CYP2D6 gene were amplified by PCR and then sequenced. The CDS of CYP2D6 gene was obtained by alignment and splicing with the wild-type reference sequence. The mutation types of coding DNA sequences and their association with vivax malaria relapse were analyzed, and the activity phenotype of CYP2D6 isozyme was predicted by its allelic form. Results: One hundred and fifty-six blood samples from 75relapsed cases of vivax malaria and the same number of 75 non-relapsed vivax malaria cases were collected for PCR amplification. Two amplification products (2411bp and 2388bp) containing exon1-4 and exon5-9 of CYP2D6 gene were obtained from every sample. The 1491-bp-length CDS chains of CYP2D6 gene were obtained from those 150 samples, and were defined as 24 haplotypes (Hap_1~Hap_24).17 haplotypes were determined from the sequences of vivax malaria relapsed patients and 15 haplotypes were from those of non-relapsed patients.Hap_6 sequencing showed G>C, and mutant heterozygotes 0.5, and 1.25, it indicates that CYP2D6 isozyme activity has the four phenotypes of PM, IM, NM and UM, respectively.

DNA sequences from non-relapsed cases only showed mutation heterozygote (C/T, 2/2) at c.886 locus. The allelic form of relapsed patients was CYP2D6 *2, and diploid score was used to indirectly predict the CYP2D6 enzyme activity of Hap_6, and the relapsed patients belong to normal metabolizer (NM).

Conclusion:
Among the numerous mutations of CYP2D6 gene, the joint mutations at three loci (c.408, c.886, c.1457) are most closely related to the damagedCYP2D6 enzyme activity. Whether c.886 locus mutation plays a critical role contributing to such damage warrants further verification with expanded sample size.

Background
In recent years, the scope of global malaria epidemic has gradually narrowed down, yet the death tolls of malaria deaths remain high, and the incidence of malaria has been increasing in certain areas [1]. Technological limitations may pose daunting challenges to the elimination of malaria, such as the difficulty in identifying low-density Plasmodium infection [2], the population expansion of drug resistant Plasmodium [3][4], and the lack of antimalarial drug reserves [5]. No local cases of malaria have been reported in China since 2016 [6]. However, Yunnan Province, as a traditional malaria endemic area in China, still bears the burden of hundreds of cases of malaria imported from abroad every year. From 2011 to 2018, 50.7%of the imported malaria cases in Yunnan province accounted for P. vivax [7]. Therefore, the effective eradication of P. vivax calls for eliminating the source of infection and blocking its transmission in Yunnan.
So far, primaquine is the only drug that has been reported effective to kill P. vivax hypnozoite. WHO recommended the use of primaquine as the only anti-relapsing drug for treating malaria [8]. However, as the active component that kills P. vivax hypnozoite, 5-hydroxy-primaquine requires the catalytic effect of CYP2D6 enzyme in human hepatocytes [9][10][11]. CYP2D6, also known as debrisoquine4hydroxylase, is an isozyme belonging to the CYP450superfamily and an important phase I drug metabolizing enzyme in the human body. Its catalytic functional group a protein containing ferroheme-mercaptide. Moreover, CYP2D6 is also the only enzyme in the P450 family that cannot be induced. It features low capacity and high affinity. CYP2D6 only accounts for 4% of the total P450 enzyme protein in the liver, guiding the metabolism of ~ 30% of commonly used clinical drugs, including primaquine [11][12][13].However, studies have found that the heterogeneity of CYP2D6 activity is mostly controlled by its genetic variation of various degrees [14], and that the damaged CYP2D6 isoenzyme activity caused by genetic polymorphism of CYP2D6 gene has become an adverse factor aggravating the risk of primaquine use. The undermined CYP2D6 enzyme activity would obstruct the generation of 5-hydroxy-primaquine, making large doses or repeated use of primaquine a common remedy for the declined efficacy of primaquine in the treatment. Meanwhile, patients with G6PD deficiency may experience dual hazards of primaquine exposure [8,[15][16]. The more dismal reality is that geographic distribution of damaged CYP2D6 enzyme activity is highly overlapped with G6PD deficiency, which is in the endemic area of P. vivax [17][18]. Hence, it is necessary to investigate not only the physical type of G6PD but also the activity of CYP2D6 isoenzyme in the treatment of vivax malaria [15,18].WHO defines that G6PD deficiency, impaired CYP2D6 function, pregnant or lactating women, and infants are not suitable for the use of primaquine [11,15].
The measurement methods of CYP2D6 enzyme activity include phenotype observation and genotype analysis prediction. Compared with phenotype measurement, the prediction method based on CYP2D6 genotype has the following advantages [19][20]. Firstly, genotypes does not change in one's life, and is not affected by environment or biological factors [20][21]. Secondly, genotyping only requires one specimen and can be completed before medication treatment of the patients and therefor has superior prospective [22]. In addition, human CYP2D6 genes can be easily sequenced to determine the combinations of CYP2D6 alleles [23][24].Therefore, it is more stable and convenient for batch operation to predict the phenotype of CYP2D6 enzyme activity by genotyping [19,[25][26][27][28][29].The prediction is based on the locus status of CYP2D6 gene [14], or the allelic combination form of zygote [30][31]. The identified genotypes are no more than alleles with complete function, reduced function or loss of function. The indicators of CYP2D6 enzyme activity include Poor metabolizer (PM), Intermediate metabolizer (IM), Normal metabolizer (NM) and Ultra-rapid metabolizer (UM), which cover various levels of CYP2D6 enzymes from ultra-fast acting to non-metabolic activity [29,31].
However, the number of alleles variants in CYP2D6 gene has reached 63 excluding a series of sub-variants (http://www.cypalleles.ki.se/) [29]. Genotype analysis and prediction method calls for further precision and comprehensive optimization [32].
Gene mutations in enzymes related to drug metabolism not only affect drug efficacy, but can even lead to toxic effect of drug reversal. Therefore, understanding the specific CYP2D6 enzyme activity in different individuals [15] is the prerequisite to ensure radical treatment of vivax malaria with personalized primaquine dose [29,33]. However, despite the frequent use of primaquine in malaria areas in Yunnan, the exposure risk of primaquine pertaining to G6PD gene has not been fully understood, nor has the polymorphic investigation into CYP2D6 mutation been conducted, nor has the appropriate population for safe use of primaquine been clarified. To improve gene prediction method of CYP2D6 and provide safety cure of P. vivax by using primaquine at the molecular level, this study intends to investigate the genetic relationship between human CYP2D6 genotype and the declined efficacy of primaquine in the radical cure of P. vivax.

Subjects and blood sample
Blood samples of vivax malaria cases diagnosed by Yunnan Province (double parasitology diagnosis of light microscopy and Plasmodium gene detection) and reported from 2014 and 2018 were collected.0.6ml venous blood filter sample was used for gene analysis. The cases were divided into two groups: relapsed and non-relapsed. The relapsed cases consisted of patients who underwent "chloroquine/primaquine eight-day therapy" yet relapsed, and that their genetic tracing of pvcsp gene showed that the infection was caused by the same Plasmodium isolates. Non-relapsed cases were randomly selected from vivax malaria cases from 2014 to 2018. The two groups had equal number of cases. Malaria infection sites were identified by epidemiological investigation. Informed consent was obtained from all the subjects for collecting blood samples and its purposes.

Extraction of human genomic DNA
Three filter paper blood membranes with a diameter of 5 mm were used and human genomic DNA was extracted using QIAgen Mini Kit kit (Germany, QIAamp Company's DNA Mini Kit), according to the manufacture's instruction, and was stored at -20 ° C prior to use.

PCR amplification of CYP2D6 gene fragment
The PCR amplification primers of CYP2D6 gene were designed by using GenBank (https://www.ncbi.nlm.nih.gov/gene/), reference sequence (ID NC_000022.11) was used as template, following the reaction conditions in previous references [26,34], The primers for first-round nested For all the PCR reaction systems, we used 2.6μl DNA template, 14.0μl 2 × Taq PCR hybrid system (QIAGEN, Germany), 0.7μl upstream primer, 0.7μl downstream primers (20umol / L). The total volume was made up to 25.0μl with ddH 2 O. PCR reaction conditions were clarified as follows: 92 to 94 ° C for 2-5 mins; 92 to 94 ° C for 10-30s, 50 to 56 ° C for 15-30s, 68 to 72 ° C for 2 to 3 min, 35 cycles; 68 to 72 ° C for 7 mins. Each fragment was subjected to 3 amplifications in parallel. The amplified products were observed using 1.5% agarose gel electrophoresis. The positive products were sent to Shanghai Meiji Biomedical Technology Co, Ltd. for bi-directional Sanger sequencing. At least two identical sequences were used for subsequent analysis.

Mutation analysis of CYP2D6 coding region
The sequencing results were trimmed using DNAStar v5.10 and BioEdit v7.0.9.0 software. The obtained DNA sequence was compared with wild-type CYP2D6 gene (GenBank: NC_000022.11) by sequence alignment to confirm exons 1-9 in the DNA sequence ofCYP2D6 gene. The DNA sequences of these 9 exons were spliced from exon 1 to exon 9 into the CDS of CYP2D6 gene. MEGA v5.04 software was used to align the CDS strand with the wild-type sequence to confirm the loci of missense mutation and synonymous mutation. DnaSP v6.11.01 software was used to identify missense mutation and synonymous mutation locus of CDS chain, and the parameters of DNA diversity index (π) and expected heterozygosity (He) were calculated [35]. The allele form of CYP2D6 gene coding region was determined by reference [26][27][28].
SPSS software (version 21.0; IBN; Chicago; IL) was used to conduct chi-square test on the frequency of haplotype co-existing in relapsed and non-relapsed patients. The OR value of haplotypes with statistical significance and the relapse rate of P. vivax were calculated. Significant level was set at α = 0.05. Value ˂ 1 indicates protective factor and > 1 means risk factor.

Prediction of CYP2D6 enzyme activity
The mutation loci of CYP2D6 gene were used to directly predict the enzymatic activity as one of four phenotypes, including PM, IM, NM and UM in accordance with previously introduced methods [19][20].The allelic form of the coding region of CYP2D6 gene was identified according to the methods described by Gaedigk et al [21], Sistonen et al [26] and Zanger et al [28]. Among them, genotype * 1 is wild-type allele; * 2 has base substitution at c. 408, c. 886 and c. 1457. Nevertheless, both two are of the same functional normal type allele. * 10 and * 41 are the allele of decreased enzyme activity caused by C > T and G >A base substitution at c.100 and c.1023+36, respectively.
Subsequently, we referred to previously described methods [29,32] to assign the activity score of the allelic form of CYP2D6 gene. The score of wild-type allele, c. 100 alleles and intron c. 1023+36 alleles at each locus were 1.0, 0.25 and 0.5 respectively. The wild homozygotes of the two loci were 2.0, and the scores of mutant homozygotes and mutant heterozygotes were 0.5, 1.0 and 1.25, 1.5, respectively. In the event of multiple loci in the genotype, the enzyme activity score is calculated as the sum of the scores of the zygotes included. When the diploid score x falls within the interval of 0.0,0 < x < 1. 25

Sample information and PCR amplification of human CYP2D6 gene
A total of 75 relapsed cases of P. vivax were enrolled. Among them 71 cases, 2 cases and 2 cases had one, two and three relapse events, respectively. The basic information of 75 non-relapsed cases is listed in Table 1. The mean age of the 150 cases was 33 years, male-female ratio was 2.66:1, and the proportions of vivax infections in Myanmar, Africa, Laos and Yunnan were 96.7% (145/150), 1.3% (2/150), 0.7% (1/150) and 1.3% (2/150), respectively.

Locus polymorphism of CDS chain
The PCR amplification products of CYP2D6 gene from the relapsed cases and non-relapsed cases were trimmed, and the CDS chains containing complete exon1-9 (total length = 1491bp) were obtained from all samples. A total of 150 CDS chains were selected from each of the 75 relapsed case and 75 non-relapsed cases, and were compared with wild-type sequence (NC:000022.11). Base replacement at 12 loci, including c.31 and c100 was found ( Table 2). The proportion of the third and the first base replacement of the codon triplet was 41.7% (5/12), and the proportion of the second base replacement was 16.6% (2/12). 7 missense mutation loci and 5 synonymous mutation loci were determined. 91.7% (11/12) mutated loci were found in the sequence of relapsed cases, and 66.7% (8/12) mutated loci were determined in the sequence of non-relapsed cases ( Table 2).   The predicted amino acid changes are indicated below. No synonymous mutation was found. R represented Relapsed cases, while N indicates Non-relapsed cases.

Haplotypes of CYP2D6 gene coding region and their association with relapse
The odds ratio of genotype to relapse was calculated by selecting six haplotype sequences with sequence number > 1 that simultaneously existed in relapsed patients and non-relapsed patients and could be used for analysis (

Enzyme activity prediction of significant haplotypes and their diploids
Hap_6 and Hap_4 were verified to exert effect on relapse based on the prediction of enzyme activity.
Among them, there were 12 CDS chains in Hap_6 (Table 4) (Table 4). Two genotypes of *1/*2 and *1/*41 were found in the nonrelapsed cases. By contrast, *1/*41 was also determined in the relapsed cases and reached the genotype score of 1.5. It was predicted to be at "NM" level in terms of enzyme activity, along with the *1/*2 genotype showing mutant heterozygous (C/T) only at c.886 and its genotype score reaching the full score of 2.0 (Table 4).

NM a Intronic position c.1023+36Intron 2988has been considered as defining allele *41 because the predictability of this mutation for impaired enzyme function has been demonstrated recently
The 13 CDS chains belonging to Hap_4 was found in 2 relapsed and 11 non-relapsed patients ( Table   4). In the diploid of relapsed cases, only one genotype was found, which is *1/*10-2. Its diploid expression at the three scoring loci c.100, c.408 and c.1457 showed mutant heterozygosity (C/T), mutant homozygosity (C/C) and mutant homozygosity (C/C), respectively. Such pattern resembled the genotype of 54.6% (6/11) of non-relapsed cases, and reached the genotype score of 1.25. The CYP2D6 enzyme activity based on this score was predicted at "NM" level ( Table 4). The remaining 5 non-relapsed cases were classified as two genotypes of *10/*10 and *1/*10-1. The diploids of genotypes *10/*10 at the three scoring loci of c.100, c.408 and c.1457 were all mutant homozygous (T/T), (C/C) and (C/C), with the lowest genotype score being 0.5. In 3 cases of *1/*10-1 genotype, mutation heterozygosity (C/T) was found at the c.100 scoring locus, the genotype score was 1.25, and its CYP2D6 enzyme activity was predicted as "NM" level (Table 4).

Discussion
The 4383-bp-length CYP2D6 gene is located on chromosome 22q13.1 and consists of 8 ~ 9 exons and 7 ~ 8 introns. The length of its CDS is 1338 ~ 1491 bp in length, and encodes 446 and 497 amino acids. The heterogeneity of CYP2D6 monooxygenase activity from complete dysfunction to ultra-rapid enzyme metabolism can be attributed to certain genotypes of CYP2D6. Among the 150 identified alleles, multiple mutations, such as CYP2D6*3, CYP2D6*4, CYP2D6*5 and CYP2D6*6, are considered to be associated with compromised CYP2D6 enzyme activity [34,36].
To our knowledge, the current study is the first one that attempts to analyze mutation polymorphism in the whole gene coding region of human CYP2D6 gene, thereby verifying its possible associated with impaired CYP2D6 enzyme activity in patients with relapsed vivax malaria events after primaquine treatment in Yunnan Province. Among the 150 CDS chains of CYP2D6 genes, 12 base substitutions were detected and were identified as 7 missense mutation loci and 5 synonymous mutation loci. Among them, 91.7% (11/12) mutation locus was found in the sequence of relapsed patients, and 66.7% (8/12) mutation locus was found in the sequence of non-relapsed patients.
Among the 24 haplotypes defined, Hap_2 makes up the largest proportion of the sequences, reaching as high as 36.6% (55/150). The haplotype has multiple mutations at four loci, including c.100C > T, c.336C > T, c.408G > C, c.1457G > C. These mutations are disparate from previous studies which reported that CYP2D6*10 allele are easy to detected in Asian populations [34,37].CYP2D6*10 allele has multiple mutations at c. 100C > T, c. 408G > C and c. 1457G > C, which are less at c. 336C > T than Hap_2. This gap might be resulted from the fact that the sequencing method of the whole gene coding region is easier to detect all base substitutions. However, whether the differences in common alleles of CYP2D6 are worthy of in-depth research is subject to different populations. The population studied in this paper is not random, but those who reside in disease endemic areas and showed 50% chances of recurrent vivax malaria patient (75/150). Thusly, the genotypes of such population may show the screening outcome of human defective genes by malaria epidemic in an accurate manner [38].
Although it is speculated that the enzyme activity damage in the heterogeneity of CYP2D6 may be related to the failed therapy of primaquine to kill hypnozoite of P. vivax [8,18,[39][40][41], the correlation between different genotypes and the relapse of P. vivax is poorly understood. In this study, we found that the genotypes with combined mutations at c. 408, c. 886 and c. 1457 were most closely related to the relapse of vivax malaria (OR = 5.615, P < 0.05) ( Table 3). Mutations at c. 408 and c. 1457, and the individuals with homozygous mutations at c. 886 had relapse phenotype (Table 4). Only 1/2 of mutant heterozygotes indicate relapse, suggesting that among the genotypes with multiple locus mutations such as c. 408, c. 886 and c. 1457, the key to prevent vivax malaria relapse may lie in mutation at c. 886 locus. Such findings is not consistent with that of Silvino et al [41], who believes that c. 100, c. 321 and other locus mutations are more closely related to vivax malaria relapse. Such disparity might be attributed to the difference between the two methods used for detecting gene mutation loci. Silvino  At present, the main methods for predicting CYP2D6 enzyme activity based on gene information include theCYP2D6-locus-mutation-polymorphism-based direct prediction method summarized by Zanger et al [28] and indirect prediction method of genotype score recommended by Gaedigk et al [29] and Caudle et al [32].Because the latter presents the advantage of quantifying the degree of gene mutation and more abundant genotypic criteria, we used this method to predict and analyze the activity of CYP2D6 enzymes in this study. However, regardless of Hap_6 indicating risk status or Hap_4 serving as protective factors, all enzyme activity phenotyping were predicted to be above IM grade. The allele for most of the relapsed patients was CYP2D6*2, and multi-locus genotypes had scores of greater than 1.5. By comparison, most of the non-relapsed patients had CYP2D6*10 allele, and their multi-locus genotypes had a score of less than 1.5. Except two cases belonging to IM phenotype, the others belonged to NM phenotype. This result is not in line with that of Brasil et al [40], which reported that most of the alleles in relapsed patients are CYP2D6*10, CYP2D6*39 and so forth. Moreover, it contradicts the clinical observation of relapsed patients in this study. Such conflict might be justified by the reason that CYP2D6 enzyme activity prediction method used in this study to obtain whole gene coding region information takes into the account of more alleles and thereby leads to enhanced genotype score. In contrast, Brasil et al. detected only a few mutation loci and therefore could miss out the reverse effects of other locus mutations.CYP2D6*2 allele showing heterogeneous phenotypes with normal and abnormal enzyme activity [43] suggests that identifying heterogeneous CYP2D6 enzyme activity with similar genotypes is the key [21,29], no matter what strategy is used to predict CYP2D6 enzyme activity. Meanwhile, the prediction method needs to be further optimized.
This study predicted the unconventional phenotype of CYP2D6 enzyme activity in patients with relapsed vivax malaria on the basis of genetic information, and therefore did not rule out the relapsed vivax malaria cases which acquired primaquine resistance after treatment. Chen et al. [44] observed the resistance of Plasmodium to primaquine in relapsed vivax malaria cases in the Australian Defense Force, therefore it is essential to eliminate the influence of interfering factors in genetic studies. This is another aspect that deserves improved prediction method to analyze CYP2D6 enzyme activity.
In this study, although only Sanger sequencing method was used to analyze the sequence polymorphism in the whole coding region of CYP2D6 gene, the strategy of three parallel repeats was adopted for each PCR reaction and the sequencing of its products. Such effort helped eliminate the error rate in single PCR direct sequencing to a certain extent. Meanwhile, the cost-effective sequencing analysis method is more suitable for molecular epidemiological cross-sectional investigation of CYP2D6 gene mutation polymorphism in the endemic areas where primaquine is used to treat vivax malaria. However, there are still some shortcomings in this study. Firstly, the sample size is not large enough, and the low-frequency alleles of CYP2D6 genes detected in the study should be further verified. Secondly, in the study of genetic association between CYP2D6 genotypes and relapsed P. vivax, although the identification of relapsed cases had achieved the genetic origin of infected strains, it did not rule out the confounding factors such as the possible resistance of Plasmodium to primaquine, hence the identification of relapsed vivax malaria patients may have been enlarged. Finally, although the CYP2D6 enzyme activity was analyzed by using the method of zygotic allelic to assign the activity score recently recommended by Caudle et al [32], this study came to define more CYP2D6 gene locus polymorphisms. Since there is no benchmark standard to assign the activity scores to these loci polymorphisms, the predicted CYP2D6 enzyme activity in this study may not be completely homogenous with the findings of Caudle et al.

Conclusion
In conclusion, the CYP2D6 enzyme activity prediction method based on the sequencing of whole gene (2) The key to predict CYP2D6 enzyme activity by genotype is to define the genotype accurately and accordingly, to study the genetic correlation between genotype and enzyme activity phenotype with reasonable sample size. Because the whole gene sequencing method can fully identify the mutation loci, it is promising as a molecular epidemiological investigation method for CYP2D6 gene mutation polymorphism. Declarations Yuxi, Chuxiong, Honghe, Zhaotong, Diqing, Qujing, and Whenshan.

Authors' contributions
HH was responsible for the gene testing, study design, statistics and analysis of the data, and wrote the manuscript; YDong performed coordination of all project, study design, analysis of the data and the manuscript guidance and revision. SL participated in the study design; YX, YL, YDeng, MC, CZ performed the collection blood samples and microscopy examination. All authors read and approved the final draft of the manuscript.

Availability of data and materials
Not applicable.

Ethics approval and consent to participate
The study was approved by Yunnan Institute of Parasitic Diseases and by the Ethical Committee.
Genetic testing experiment, etc. were performed on stored blood samples obtained as part of routine diagnostic work-up patients with fever suspected of malaria. Although the absence of risk and the anonymous data processing are ensured among the study, consent from potential malaria patients need to be obtained during collecting blood samples. Demographic, clinical and epidemiological information of each fever patient will be collected. Database access will be restricted by password, and Yunnan Institute Parasitic Diseases will not allow retrieving and saving the personal identification information into the project database. It is committed not to provide information about the patient to any person unrelated to the study.

Supplementary Files
This is a list of supplementary files associated with this preprint. Click to download.