Genetic Polymorphism of human CYP2D6 gene coding region from vivax malaria cases in Yunnan Province, China

Background: Primaquine is one of two medications able to effectively eliminate the hypnozoites of P. vivax, and therefore has been recommended by World Health Organization (WHO) as the anti-relapsing drug for the treatment of vivax patients. However, 5-hydroxy-primaquine, the presumed active component, must be generated by CYP2D6 enzyme in human hepatocytes and reduced CYP2D6 enzyme activity caused by gene mutations has been considered cause of primaquine failure. Based on the analysis of CYP2D6 gene polymorphisms in suspected relapsed vivax malaria patients, the current study aims to determine the association between human CYP2D6 genotype and the ecacy of primaquine. Methods: Blood samples from vivax patients in Yunnan Province who received "chloroquine/primaquine eight-day therapy" and experienced suspected relapses were collected from 2014 to 2018. Human genomic DNA was extracted, and 9 exons of CYP2D6 gene were amplied by polymerase chain reaction (PCR) and were then sequenced. The mutation types of CDS and their association with CYP2D6 enzyme activity changes were analyzed. Results: A total of 156 blood samples from 75 relapsed cases of vivax malaria and 75 non-relapsed cases were collected and two nested-PCR products, the CYP2D6 gene fragments containing exon1-4 and exon5-9 (2411bp and 2388bp), were obtained from the sample. After splicing and combining these two amplication products, we found that the CDS of CYP2D6 gene in each sample was 1491bp in length. Moreover, we identied 24 haplotypes (Hap_1~Hap_24) in the Clustered CYP2D6 full-CDS of 150 patients, including 17 haplotypes in relapsed patients and 15 haplotypes in non-relapsed patients (8 haplotypes co-existed in two groups of patients). In addition, we identied 33 diplotypes (A_1~A_33) in 150 patients. The A_10 and A_12 diplotypes showed higher In conclusion, the prediction of CYP2D6 enzyme activity based on whole gene coding region sequencing preliminarily reveals CYP2D6 gene mutation polymorphisms in relapsed P. vivax patients who showed poor curative effect of primaquine in Yunnan. This study makes up for the analysis of genetic correlation factors in relapsed P. vivax patients in China, and reaches the following ndings: (1) Among the multiple mutations of CYP2D6 gene, combined mutations at three loci, including c.408, c.886 and c.1457, are most closely related to decreased CYP2D6 enzyme activity. Mutations at c.408 and c.1457 induce compromised CYP2D6 enzyme activity, yet a larger sample size is need to validate whether such decreased activity is aggravated by c.886 mutation; (2) The key to predict CYP2D6 enzyme activity is to dene the genotype accurately and accordingly, based on the genetic correlation between genotype and enzyme activity phenotype with reasonable sample size. Because the whole gene sequencing method can fully identify the mutation loci, it is a promising molecular epidemiological method for investigating CYP2D6 gene mutation polymorphism.


Background
In recent years, the global malaria epidemic has gradually scaled down, yet the death tolls of malaria remain high, and the number of malaria cases in certain areas has rebounded [1]. Meanwhile, multiple technological limitations pose daunting challenges to the elimination of malaria, such as the di culty in identifying low-density Plasmodium infection [2], the population expansion of drug resistant Plasmodium [3][4], and few new drugs in the antimalarial pipeline [5]. Although Chinese local cases of malaria (without a travel history to malaria epidemic areas outside China within the last 30 days) have not been reported since 2016 [6], China still needs to cope with a large number of imported malaria cases every year. Located in southwest China bordering Myanmar, Laos and other Southeast Asian countries (Fig.1), Yunnan Province has experienced large-scale malaria epidemic and was once the hardest-hit area of malaria [7].
Yunnan Province is still facing a great risk of imported malaria. And from 2011 to 2018, 50.7% of the imported malaria cases in Yunnan Province were caused by P. vivax [8]. Therefore, the elimination of malaria in Yunnan Province calls for the strict enforcement radical cure treatment to block the transmission of P. vivax. Primaquine has proven to be effective in eradicating the hypnozoites of P. vivax, and the World Health Organization (WHO) recommends it for the treatment of relapsed malaria (and since 2018, the newly approved tafenoquine) [9]. The Cytochrome P450, family 2, subfamily D, polypeptide 6 (CYP2D6) enzyme in human hepatocyte metabolizes primaquine into 5-hydroxy-primaquine, the component thought to be responsible for killing P. vivax hypnozoites [10][11][12]. CYP2D6, also known as debrisoquine4-hydroxylase, is an isozyme belonging to the Cytochrome P450 (CYP450) superfamily and an important phase I drugmetabolizing enzyme in the humans. CYP2D6 has low capacity and high a nity. Even though CYP2D6 only accounts for 4% of the total P450 enzyme protein in the liver, it is involved with the metabolism of ~30% of commonly used drugs, including primaquine [12]. Studies have found that the heterogeneity of CYP2D6 activity is predominantly governed by its genetic variation [13], and that decreased CYP2D6 isoenzyme activity caused by genetic polymorphisms would obstruct the generation of 5-hydroxy-primaquine, making large doses or repeated use of primaquine necessary to compensate for the declined e cacy of primaquine. Meanwhile, primaquine can cause life threatening hemolysis in patients with Glucose-6-Phosphate Dehydrogenase (G6PD) de ciency, subsequently in icting dual hazards, including low e cacy of anti-relapse and acute hemolysis, on the patients [9,[14][15]. So the patient's G6PD gene type should be identi ed by care workers prior to receiving anti-relapse treatment of vivax malaria and based on our and others' ndings, it should be considered to test for CYP2D6 genotype and predicted enzyme activity as well [14,[16][17].
CYP2D6 enzyme activity is measured mainly through phenotype observation and genotype analysis prediction. Compared with phenotype measurement, prediction method based on CYP2D6 genotype is characterized by the following advantages [18][19]. Firstly, genotypes do not change throughout one's life, and are not affected by environment or physical factors [19][20]. Secondly, genotyping only requires one specimen and can be completed before treatment [21].
In addition, using sequencing to determine the combination form of CYP2D6 alleles is a technique that can be easily obtained at present. Therefore, the use of genotyping method can create more stable results in predicting the phenotype of CYP2D6 enzyme activity [18,[22][23][24][25][26]. The prediction is based on the locus status of CYP2D6 gene [13], or the allelic combination form of diplotype [27][28]. However, more than 113 CYP2D6 alleles have been de ned by the Cytochrome P450 Nomenclature Committee at https://www.pharmvar.org/gene/CYP2D6 [23]. The prediction of CYP2D6 enzyme activity by performing genotype analysis needs to be further optimized to enhance accuracy and systematicness [29].
Gene mutations of enzymes related to drug metabolism not only affect the clinical consequences of drug metabolism, but can even lead to individual adverse event [13]. Therefore, understanding the speci c CYP2D6 genetic polymorphism in different individuals is the prerequisite to ensure radical treatment of vivax malaria based on personalized primaquine dose [24]. Despite the frequent use of primaquine in malaria endemic areas of Yunnan, no systematic survey about genetic polymorphisms of CYP2D6 has been conducted. Based on the investigation of CYP2D6 gene polymorphism, this study preliminary explores the association between human CYP2D6 genotype and impaired e cacy of primaquine in the radical cure of P. vivax in order to identify population who may be refractory to primaquine treatment in Yunnan Province.

Materials And Methods
Subjects and blood sample The subjects were diagnosed cases of P.vivax in Yunnan Province. Mono-infection of P. vivax was con rmed by both microscopy examination and Plasmodium 18S rRNA gene detection (PCR) by Yunnan Province Malaria Diagnosis Referent Laboratory (YPMDRL). All vivax malaria patients spent three days in the hospital and underwent three-days oral Chloroquine therapy (total 1550 mg), followed by an 8-day course of Primaquine (22.5 mg/day). And the eight days was directly observed therapy (DOT) by local Center for Disease Control (CDC) personnel to ensure compliance. The subjects were divided into two groups: (1) suspected relapsed group (SR group).The SR group consisted of patients who were re-diagnosed with vivax malaria within six months of follow-up after receiving above-mentioned treatment and who has not been exposed to malaria transmission again between two clinical malaria outbreaks through epidemiological investigation, and their infection was caused by the sequence similarity above 98% of Plasmodium in two samples detected by sequencing of P. vivax circumsporozoite protein (pvcsp). A total of 75 SR cases were con rmed in Yunnan Province from 2014 to 2018. (2) Non-relapsed group (NR group). Seventy-ve NR cases was selected by computer generated randomization from 143 cases of vivax malaria in 2018 who were followed up for one year and had no history of recurrence or signs. The infection sources of vivax cases were con rmed according to epidemiological investigation: those without a travel history to epidemic areas outside Yunnan Province within the last 30 days before the onset of malaria were de ned as local cases; those have history of travelling to epidemic regions, such as Myanmar and Africa, were regarded as imported cases.
The blood samples of all the subjects were provided by the Centers for Disease Control and Prevention in Yunnan Province. Dried blood spots samples for genetic analysis were derived from 0.6 ml of venous blood, which were then put in a drying tube and stored at -80 ℃ before DNA extraction.
Extraction of human genomic DNA Three dried blood spots (diameter = 5 mm) were made, and human genomic DNA was extracted by using QIAgen Mini Kit kit (Germany, QIAamp Company's DNA Mini Kit), according to the manufacture's instruction. The extracted DNA was stored at -20 ℃.

PCR ampli cation of CYP2D6 gene fragment
The primers of CYP2D6 for polymerase chain reaction (PCR) ampli cation were designed by using GenBank (https://www.ncbi.nlm.nih.gov/gene/), reference sequence (ID NC_000022.11) was used as template, following the reaction condition in previous studies [23,30]. The primers for rst-round nested PCR covering the coding region of CYP2D6 gene exons1~4 were: 5'-CCAGTGACAGATAAGGGTGC-3' and 5'-GACGTGGATAGGAGGTACAGAG-3'; the primers for second-round nested PCR were: 5'-GGTGACTTCTCCGACCAGG-3' and 5'-TTCCCAAACCCATCTATGC-3'. The nal ampli cation region was 42131088 4 2128678, and the expected fragment length of the product was 2411 bp. The primers that cover exons 5~9 of CYP2D6 gene were: 5'-GCCGACTGAGCCCTGGGAGGTAGGTA-3' and 5'-GCTGGGGCCTGAGACTT-3. The ampli cation area was 42126035~42128422, and the ampli cation products had an expected fragment length of 2388 bp. For all the PCR reaction systems, we used 2.6μl DNA template, 14.0μl 2 × Taq PCR hybrid system (QIAGEN, Germany), 0.7μl upstream primer (20umol / L), 0.7μl downstream primers (20umol / L). The total volume was adjusted to 25.0μl with ddH 2 O. PCR reaction conditions were clari ed as follows: 92 to 94 ° C for 2-5 mins; 92 to 94 ° C for 10-30s, 50 to 56 ° C for 15-30s, 68 to 72 ° C for 2 to 3 min, 35 cycles; 68 to 72 ° C for 7 mins. The triplicated parallel repetition was adopted for each PCR reaction. The ampli ed products were observed using 1.5% agarose gel electrophoresis. The positive ampli cation products were sent to Shanghai Meiji Biomedical Technology Co, Ltd. for sequencing by using the dideoxy chaintermination method. Only the sequences that showed identical results in at least two tests were used for subsequent analysis.

Mutation analysis of CYP2D6 coding region
The sequencing results were aligned by using DNAStar v5.10 and BioEdit v7.0.9.0 software. The obtained all DNA sequences were assessed by using the Basic Local Alignment Search Tool (BLAST, http://blast.ncbi.nlm.nih.gov/Blast.cgi) in the NCBI platform. Sequences with identi cations equals 100% and Query cover above 99% were considered as CYP2D6 gene sequence of human. The DNA sequence of exon 1~9 region of CYP2D6 gene was determined and spliced into CYP2D6 CDS in the order from exon1 to exon9. MEGA v5.04 software was used to align the CDS of each case with the wild-type sequence to con rm the loci of missense mutation and synonymous mutation. The de ned mutation loci are queried for the corresponding ID in Genbank. DnaSP v6.11.01 software was used to identify missense mutation and synonymous mutation locus in each CDS, and the parameters of DNA diversity index (π) and expected heterozygosity (He) were calculated. Determine the allelic forms of the mutation loci by checking the sequencing peak map, the diplotypes of 150 cases were determined and the frequency of each diplotype was calculated. SPSS software (version 21.0; IBN; Chicago; IL) was used to conduct chi-square test or Fisher's Exact Test on the frequency of haplotypes, diplotypes and mutation loci co-existed in SR and NR group, the inspection level was 0.05.

Results
Sample information and PCR ampli cation of human CYP2D6 gene A total of 75 SR cases of P. vivax were enrolled, including 71 cases of one suspected relapsed event, 2 cases of two suspected relapse events, and two case of three suspected relapsed events. The geographic information of 75 NR cases is listed in Table 1. The male-female ratio was 2.66:1, and the proportions of cases imported from Myanmar, Africa countries, Laos and Yunnan were 96.7% (145/150), 1.3% (2/150), 0.7% (1/150) and 1.3% (2/150), respectively. The odds ratio of relapsed was calculated in different age groups and genders. Statistical signi cance was determined in age groups of 5-20 years (OR = 0.380; 95% CI: 0.986~0.146) and 21-60 years (OR = 2.471:95% CI: 5.417~1.127). No signi cant correlation between other age groups or gender and the relapsed of vivax malaria was found. The PCR ampli cation products of CYP2D6 gene from the SR group and NR group were trimmed, and the CDS chains containing the complete exon1-9 (total length = 1491bp) were obtained from all samples. A total of 150 CDS chains (Genbank accession number: MT339044-MT339193) were selected from the 75 SR cases and 75 NR cases, and were then compared with wild-type sequence (NC: 000022.11). Base substitutions at 12 loci, such as c.31 and c.100, were determined ( Table 2). All the mutation loci were found in Genbank and the corresponding ID, was regarded as result normal, and no new mutation loci were found. The proportions of third-base and rst-base substitution in the codon triplet were 41.7% (5/12), and the proportion of second-base substitution was 16.6% (2/12). 7 missense mutation loci and 5 synonymous mutation loci were determined. Of these mutation loci, the SR cases accounted for 91.7% (11/12), whereas the mutation loci of NR cases accounted for 66.7% (8/12) ( Table 2).
Comparison of the two groups differences in mutation allele frequency at the 12 SNPs (Table 2). In the SR group, the mutation allele frequency was the highest in c.1457 G>C (80.7% 121/150), followed by c. 408 G>C (73.3% 110/150). In the NR group, on the contrary, the mutation allele frequency was the highest in c. 408 G>C (80.0% 120/150), followed by c. 1457 G>C (74.7% 112/150). Regarding the two mutation alleles, a non-signi cant trend (x 2 1.863 and x 2 1.557 P > 0.05) for difference in frequency distribution among groups SR and NR was detected. However, the frequency differences of the other two mutation alleles c. 100C>T and c. 886C>T between the two groups were statistically signi cant. The variant c. 100C>T allele showed lower frequency in SR group (x 2 9.039 P < 0.05). But the variant c. 886C>T allele showed higher frequency in SR group (x 2 8.165 P < 0.05).

Polymorphism of diplotypes
The results of allelic form recognition to mutation loci showed that the CYP2D6 gene in 150 cases could be de ned as 33 diplotypes (Table 3). Among them, only one diplotype (A_l) was non-mutation homozygous in 12 SNPs, the other 32 diplotypes showed mutant heterozygous at least one locus (A_21). In the 32 mutant diplotypes, the allelic form of 96 mutation loci accumulated in 12 loci were mutant homozygous accounted for 72.9% (70/96). Although there was no regularity in the number of mutation loci between the SR group and the NR group, the diplotype A_17 showed mutations at as many as 7 loci (c. 100, c. 271, c. 281, c. 294, c. 297, c. 408 and c. 1457), and the diplotype came from a patient suspected of relapsed vivax malaria for 3 times. In addition, The diplotypes A_10 and A_12 showed mutant homozygous at c.408, c.1457 and c.408, c.886, c.1457, respectively, and that the frequency of SR group was signi cantly higher than that of NR group (Fisher's Exact Test P < 0.05) ( Table 3). But the frequency of diplotype A_30 in the NR group was signi cantly higher than that in the SR group (Fisher's Exact Test P < 0.05) ( Table 3). Hap_4
To our knowledge, the current study is the rst one that attempts to analyze mutation polymorphism in the whole coding region of human CYP2D6 gene, thereby verifying its possible association with thwarted CYP2D6 enzyme activity in relapsed vivax malaria patients after primaquine treatment. Among the 24 haplotypes de ned, Hap_2 makes up the largest proportion of the sequences, reaching as high as 36.6% (55/150). This haplotype shows multiple mutations at four loci, including c.100C > T, c.336C > T, c.408G > C, c.1457G > C. These mutations are different from previous studies which reported that CYP2D6*10 allele is frequently detected in Asian populations [32,34]. CYP2D6*10 allele has multiple mutations at c. 100(C > T), c. 408(G > C) and c. 1457(G > C), yet is devoid of c. 336C > T mutation as opposed to Hap_2. Such disparity might be attributed to fact that the sequencing of the whole gene coding region can easily detect all base substitutions. However, the factor of different population should be taken into account to determine whether such differences in common alleles of CYP2D6 gene are worthy of in-depth research. The population studied in this paper is not randomly selected, but is composed of vivax malaria patients residing in endemic areas, and 50% of them (75/150) had relapsed malaria. Thusly, the genotypes of such population may show the accurately selective outcome of human defective genes after the screening of malaria epidemic.
Although decreased enzyme activity in the heterogeneity of CYP2D6 is speculated to lead to failed therapy of primaquine to kill hypnozoite of P. vivax [9,17,[35][36][37], the correlation between different genotypes and the relapse of P. vivax is poorly understood. In this study, we found that the genotypes with combined mutations at c. 408, c. 886 and c. 1457 were most closely related to the relapse of vivax malaria (OR = 5.615, P < 0.05) ( Table 3), and that mutation at c. 886 locus is the key indicator that determines vivax malaria relapse. Such ndings are not consistent with that of Silvino et al [37], which believes that c. 100, c. To date, direct prediction method based on CYP2D6 locus mutation polymorphism, which was summarized by Zanger et al [27] and indirect prediction method of genotype scoring, which was recommended by Gaedigk et al [28] and Caudle et al [31], are applied to predict CYP2D6 enzyme activity based on genetic information. Since the genotype scoring method can quantify the degree of gene mutation and abides by more abundant genotypic criteria, this method was used in the study. However, no matter Hap_6 indicates high risk of relapse or Hap_4 functions as the protective factors, we predicted that all the genotypes are above IM level. CYP2D6*2 is the allele type that indicates highest risk of relapse, and most of the non-relapsed patients had CYP2D6*10 allele. This result is not in line with that of Brasil et al [36], which reported that most of the alleles in relapsed patients are CYP2D6*10, CYP2D6*39 and so forth. Moreover, it contradicts the clinical observation of relapsed patients in this study. Such con ict of outcome might be justi ed that the prediction method of CYP2D6 enzyme activity used in this study focuses on analyzing the whole gene coding region information, thereby taking more alleles into account and leading to enhanced genotype score. In contrast, Brasil et al., detected only a few mutation loci and therefore could miss out the reverse effects of other locus mutations. CYP2D6*2 allele shows heterogeneous phenotypes exhibiting both normal and abnormal enzyme activity [39]. Therefore, identifying the heterogeneous CYP2D6 with similar genotypes is the key to predicate the enzyme activity of CYP2D6 [20,28], regardless of the method used for detection. Meanwhile, the prediction method should be further optimized, because the effect of primaquine resistance in patients with relapsed vivax malaria is not eliminated.
Although this study applied rst-generation sequencing method to analyze polymorphism in the coding region of CYP2D6 gene, the strategy of triplicated parallel repetition was adopted for each PCR reaction and the sequencings of its products. Such effort helped eliminate the rate of error in single PCR sequencing to a certain extent. Meanwhile, the cost-effective sequencing analysis method is more suitable for molecular epidemiological cross-sectional investigation of CYP2D6 gene mutation in the endemic areas where primaquine is used to intervene vivax malaria. However, this study has some limitations.
Firstly, the sample size is not large enough, and the low-frequency alleles of CYP2D6 genes detected in the study should be further veri ed. Secondly, all the non-relapsed patients were selected from 2018, which is not equivalent to the relapsed patients enrolled for four consecutive years. Sampling bias may affect the experimental results due to changes of various factors over the years. In addition, other confounding factors which may amplify the identi cation of relapsed vivax malaria patients, such as the existence of primaquine-resisted Plasmodium, are not ruled out, despite the identi cation of relapsed cases has been achieved by comparing the genetic origin of infected strains. Finally, CYP2D6 enzyme activity was analyzed by assigning the activity score of zygotic allelic, a method recommends by Caudle et al [31], yet this study de ned more CYP2D6 gene locus polymorphisms. Since there is no benchmark standard to scores the activity of these loci polymorphisms, the prediction of CYP2D6 enzyme activity in this study may not be completely homogenous with the ndings of Caudle et al.

Conclusion
In conclusion, the prediction of CYP2D6 enzyme activity based on whole gene coding region sequencing preliminarily reveals CYP2D6 gene mutation polymorphisms in relapsed P. vivax patients who showed poor curative effect of primaquine in Yunnan. This study makes up for the analysis of genetic correlation factors in relapsed P. vivax patients in China, and reaches the following ndings: (1) Among the multiple mutations of CYP2D6 gene, combined mutations at three loci, including c.408, c.886 and c.1457, are most closely related to decreased CYP2D6 enzyme activity. Mutations at c.408 and c.1457 induce compromised CYP2D6 enzyme activity, yet a larger sample size is need to validate whether such decreased activity is aggravated by c.886 mutation; (2) The key to predict CYP2D6 enzyme activity is to de ne the genotype accurately and accordingly, based on the genetic correlation between genotype and enzyme activity phenotype with reasonable sample size. Because the whole gene sequencing method can fully identify the mutation loci, it is a promising molecular epidemiological method for investigating CYP2D6 gene mutation polymorphism.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.