The consensus genotypes generated to date (96 samples for CYP2D6 and 93 samples for CYP2C19) led to revision of the assigned enzyme activity score in 28/96 samples (29%) for CYP2D6 and 2/93 samples (2.2%) for CYP2C19 (sample selection was enriched for structural variants in CYP2D6). These changes in assigned activity score were due to both changed genotype assignments and new genotype assignments for samples that were “no calls” on AmpliChip (Fig. 3). For CYP2C19, the highest concordance with consensus genotype was in the Luminex and PharmacoScan data (100%). Data from Luminex, Agena, TaqMan, AmpliSeq, PharmacoScan, and AmpliChip were 100% concordant for CYP2C19*2 and CYP2C19*17, the most common loss-of-function and gain-of-function haplotypes, respectively, in individuals of European ancestry. No adjustments to the prior AmpliChip data were therefore required for either of these haplotypes; prior clinical association analyses conducted on the basis of these CYP2C19 haplotypes are therefore valid (Huezo-Diaz et al., 201223; Fabbri et al., 201838).
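The proportions quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis code: the helper names `percent` and `concordance` are hypothetical, the toy sample calls are invented for illustration, and the choice to exclude “no calls” from the concordance denominator is an assumption rather than a rule stated in the text.

```python
def percent(numerator: int, denominator: int, digits: int = 1) -> float:
    """Return numerator/denominator as a percentage rounded to `digits` places."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100 * numerator / denominator, digits)


def concordance(calls: dict, consensus: dict) -> float:
    """Percentage of samples whose platform call equals the consensus genotype.

    Assumption: samples with no call (missing or None) are left out of the
    denominator; a different convention would give a different figure.
    """
    called = [s for s in consensus if calls.get(s) is not None]
    if not called:
        raise ValueError("no samples with a call on this platform")
    agree = sum(calls[s] == consensus[s] for s in called)
    return percent(agree, len(called))


# Counts reported in the text: (samples with a revised activity score, samples genotyped).
revised = {"CYP2D6": (28, 96), "CYP2C19": (2, 93)}
revision_rates = {gene: percent(n, total) for gene, (n, total) in revised.items()}

# Hypothetical toy data, for illustration only.
toy_consensus = {"s1": "*1/*2", "s2": "*1/*17", "s3": "*2/*17"}
toy_platform = {"s1": "*1/*2", "s2": "*1/*17", "s3": None}  # s3 is a no call
toy_concordance = concordance(toy_platform, toy_consensus)
```

With the reported counts, the revision rates round to 29.2% and 2.2%, in line with the 29% and 2.2% given in the text.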
For CYP2D6, all technologies other than the AmpliChip were able to reliably detect CYP2D6*5. Haplotype phasing of CYP2D6xNs was achieved by using relevant TaqMan assays on genomic DNA (Fig. 2), or by genotyping an amplicon specific for the XN. Although using allelic ratios to cluster TaqMan genotype data leaves a degree of uncertainty around genotypes (e.g., if only one probe amplifies, it may not be possible to distinguish between C/C, CC/C, CC/- and C/-), this technique can be used effectively to distinguish different heterozygote groups (Fig. 2). A strength of the sample set was the availability of prior AmpliChip data including haplotype phasing of CYP2D6xNs. The haplotype phasing achieved with our methods was consistent with the prior data, where available. One sample was genotyped as having a multiplication (i.e., more than two copies), specifically of CYP2D6*41, which has been previously described29,39. The majority of the revisions in assigned enzyme activity score were due to the inability of AmpliChip to detect hybrids (Supplementary Table 7) and the inconsistency of CYP2D6*5 detection by AmpliChip.
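The point about allelic ratios can be illustrated with a small sketch. It assumes an idealised assay in which the signal for each allele is proportional to the number of copies carrying it; real TaqMan clusters are noisier, and the function names below are hypothetical. Genotypes are written as in the text, with "-" marking a chromosome that carries no gene copy.

```python
from collections import defaultdict
from fractions import Fraction


def allele_counts(genotype: str) -> dict:
    """Count allele copies in a genotype string such as 'CC/T' or 'C/-'."""
    counts = defaultdict(int)
    for symbol in genotype:
        if symbol not in "/-":  # skip the separator and the deletion marker
            counts[symbol] += 1
    return dict(counts)


def c_fraction(genotype: str) -> Fraction:
    """Idealised allelic ratio: copies of allele C over all gene copies."""
    counts = allele_counts(genotype)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no gene copies present, so no probe can amplify")
    return Fraction(counts.get("C", 0), total)


def group_by_signal(genotypes: list) -> dict:
    """Group genotypes that give the same idealised allelic ratio."""
    groups = defaultdict(list)
    for genotype in genotypes:
        groups[c_fraction(genotype)].append(genotype)
    return dict(groups)


# Only the C probe amplifies: all four genotypes share one ratio (all C).
single_probe = group_by_signal(["C/C", "CC/C", "CC/-", "C/-"])

# Both probes amplify: the ratios 1/2, 2/3 and 1/3 separate the groups.
heterozygotes = group_by_signal(["C/T", "CC/T", "C/TT"])
```

In this idealised model, the four single-probe genotypes collapse into one group, whereas the three heterozygote groups remain separable. Note that total copy number alone would still not separate C/C from CC/-, since both carry two copies.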
Hybrid haplotypes are a focus of recent research on CYP2D626,40−42. Samples with CYP2D6-2D7 or CYP2D7-2D6 hybrid genes were identified through multiple methods, including genotyping the L-PCR amplicons specific for CYP2D7-2D6 hybrids (CYP2D6*13 variants) using the Luminex CYP2D6 assay. The resultant data were consistent with the specific CYP2D6*13 variant sequences to which these samples had been aligned through Sanger sequencing. Sanger sequencing of amplicons may therefore not be necessary to identify such hybrid variants: screening for hybrids using CNV probes and multiplex SNV detection by methods such as Luminex or AmpliSeq may be sufficient. Our CYP2D6 haplotype translator (Supplementary Table 1) was able to identify some hybrid tandems.
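The general idea of a haplotype translator can be sketched as a lookup from a phased set of SNVs to a star allele. This is an illustration only and does not reproduce Supplementary Table 1: the definitions below are deliberately simplified to one or two commonly cited key variants per haplotype, the matching rule (choose the definition that explains the most observed SNVs) is an assumption, and hybrids and copy number are not handled.

```python
# Simplified, illustrative definitions (key SNVs only); not the authors' table.
DEFINITIONS = {
    "*4": frozenset({"rs3892097", "rs1065852"}),
    "*10": frozenset({"rs1065852"}),
    "*41": frozenset({"rs28371725", "rs16947"}),
    "*2": frozenset({"rs16947"}),
}


def translate(observed_snvs):
    """Map the SNVs on one phased haplotype to a star allele.

    Returns (allele, unexplained_snvs). A haplotype with no variant SNVs is
    reported as "*1"; a haplotype matching no definition returns (None, snvs).
    """
    observed = frozenset(observed_snvs)
    if not observed:
        return "*1", frozenset()
    matches = [
        (allele, snvs) for allele, snvs in DEFINITIONS.items() if snvs <= observed
    ]
    if not matches:
        return None, observed
    # Assumed rule: prefer the definition that accounts for the most SNVs.
    allele, snvs = max(matches, key=lambda item: len(item[1]))
    return allele, observed - snvs


example_calls = {
    "haplotype A": translate({"rs3892097", "rs1065852"}),
    "haplotype B": translate({"rs16947"}),
    "haplotype C": translate(set()),
}
```

Returning the unexplained SNVs alongside the call makes it easy to flag haplotypes that need follow-up rather than forcing them into the nearest definition.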
Our cross-technology comparisons suggest the following approach for efficient genotyping of CYP2D6: a multiplex SNV and CNV assay, haplotype phasing, and L-PCRs with sequencing where necessary (Fig. 1). Appropriate positive controls (e.g., from the Genetic Testing Reference Material Program (GeT-RM)43), especially for the haplotypes that we did not see in this European sample set and which might be found in other ethnic groups, should be run with the assays. All of these assays other than the downstream processing of amplicons could be run in parallel, and the downstream processing of amplicons could be conducted efficiently using a multiplex CYP2D6 assay. Different technologies have their strengths and weaknesses, particularly in regard to coverage of CYP2D6 CNVs. An even more comprehensive strategy, which would capture novel variants in non-coding regions, would include sequencing of the CYP2D6-CYP2D7 intergenic region and the CYP2D6 downstream region.
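The decision points in this approach can be sketched as a simple triage on CNV calls. This is a hedged illustration, not the workflow in Fig. 1: the rule that discordant copy number across regions of the gene flags a possible hybrid is an assumption made for the example, and the class, function, and label names are hypothetical.

```python
from dataclasses import dataclass

NO_SV = "no structural variant: translate SNVs to haplotypes"
DELETION = "copy number loss: report gene deletion"
GAIN = "copy number gain: phase the duplicated or multiplied haplotype"
HYBRID = "discordant regions: suspect hybrid, run L-PCR and sequence if needed"


@dataclass
class CnvCall:
    region: str  # e.g. "intron 2", "intron 6", "exon 9"
    copies: int  # total gene copies estimated at this region


def triage(cnv_calls: list) -> str:
    """Choose the next step for one sample from its per-region CNV calls."""
    if not cnv_calls:
        raise ValueError("at least one CNV call is required")
    distinct = {call.copies for call in cnv_calls}
    if len(distinct) > 1:
        # Assumed rule: regions disagree, so part of the gene may be replaced.
        return HYBRID
    (copies,) = distinct
    if copies == 2:
        return NO_SV
    return DELETION if copies < 2 else GAIN


sample_steps = {
    "two copies throughout": triage(
        [CnvCall("intron 2", 2), CnvCall("intron 6", 2), CnvCall("exon 9", 2)]
    ),
    "exon 9 differs": triage(
        [CnvCall("intron 2", 3), CnvCall("intron 6", 3), CnvCall("exon 9", 2)]
    ),
}
```

Because each step depends only on the CNV calls and the SNV data, the SNV assay, CNV assay, and L-PCRs can be set up in parallel and the triage applied once the results are in.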
Limitations of this work include the following. Firstly, we have not covered de novo variants. Secondly, the work was conducted in a set of samples from European individuals being treated for depression, selected to be representative of the genotypes available in the whole set and enriched for CYP2D6 structural variants. As such, we did not find CYP2D6 haplotypes that are more commonly found in other ethnic groups, such as *29; although the technologies can identify this haplotype, none was detected in our data, and we were therefore unable to validate its detection. Of note, however, reference samples with this haplotype are available from the Genetic Testing Reference Material Program (GeT-RM)43. Thirdly, it is theoretically possible that our CNV detection methods resulted in false positive calls for copy number loss in introns 2 and 6, owing to sequence variation in the relevant regions44. However, as we used three different technologies (AmpliSeq, PharmacoScan, and TaqMan), covering probes in multiple regions of CYP2D6 in addition to introns 2 and 6, and subjected any putative hybrid haplotypes to L-PCR and Sanger sequencing, we do not think this is a significant concern.
We suggest supplementing the CYP2D6 SNV coverage described herein with TaqMan assays as follows: (a) for haplotyping XNs (13 assays including a custom assay for an exon 9 conversion SNV for CYP2D6*36, 6 of which we have already used to demonstrate methodology); (b) to extend coverage to haplotypes of >1% in any ethnic group (6 assays)17,18; (c) another CNV assay, for the CYP2D6 5’ region; and (d) an assay (C__29692254_10) for rs5758550, which tags a region 100 kb downstream from CYP2D6 proposed to act as an enhancer45. We have previously validated the C__29692254_10 assay against PharmacoScan data (concordance 100%)46–48. However, the functional consequence of this SNV on a background of a broad range of different haplotypes is at present unknown49. The CYP2C19 haplotypes included in the Luminex assay (CYP2C19*2-*10, and *17) cover the CYP2C19 change-of-function haplotypes currently identified in all ethnic groups at a frequency >1%19,20 apart from CYP2C19*13, *15, *35, and *18. There are TaqMan assays available for the first three, while for the CYP2C19*18 g.80156G>A SNV, a custom assay is required. For PharmacoScan, the four haplotypes are already covered. We also suggest an additional TaqMan assay for the c.463G>T variant (rs374036992) that may be found on the CYP2C19*17 haplotype and introduces a premature stop codon50, and an assay to enable CYP2C19*17 haplotype phasing50. These suggestions for CYP2C19 cover a greater range of haplotypes than suggested by previous authors51, although we have not yet included coverage of the recently discovered CYP2C19 structural variants15. TaqMan assays for both CYP2D6 and CYP2C19 can be efficiently multiplexed.
Our suggested strategy for CYP2D6 and CYP2C19 genotyping is based on the data available to us. Other strategies (such as single molecule long read sequencing52) or modified versions of various technologies might work equally well. Considerations for choice of methodology include expected batch sizes of samples, cost, and the ability to add novel haplotypes. We have derived haplotype translation files for both CYP2D6 and CYP2C19 to enable efficient genotyping, adaptable for use with multiple technologies27. Our suggested genotyping strategy resulted in more accurate phenotype assignment for CYP2D6 and CYP2C19 in a subset of the GENDEP pharmacogenomic clinical trial, justifying extension of this work to the remainder of this dataset. In addition, the strategy with the additional assays suggested above provides a method for comprehensive detection of haplotypes above a frequency of 1% in any ethnic group for clinical implementation.