SNV/indel detection
After screening, a total of 37 PKU patients with unknown genotypes were included. Their clinical information and their genotypic and phenotypic characteristics are summarized in Table 1.
Table 1. Genotypes and phenotypes of 37 PKU families.
Patient | Age | Pre-treatment Phe levels (μmol/L) | PAH allele 1 | PAH allele 2 | Classification
1 | 4y9d | 606 | c.1065+241C>A | c.208_210del (p.Ser70del) | mPKU
2 | 1m17d | 2,340 | c.1199+502A>T | c.442-1G>A | cPKU
3 | 5m20d | 2,622 | c.1030G>A (p.Gly344Ser) | c.1199+502A>T | cPKU
4 | 1m | 1,548 | c.740G>T (p.Gly247Val) | c.1199+502A>T | cPKU
5 | 7y | NA* | c.1068C>A (p.Tyr356Ter) | c.1199+502A>T | cPKU
6 | 2m14d | 540 | c.992T>C (p.Phe331Ser) | NA | mPKU
7 | 6m12d | 796 | c.1065+241C>A | c.728G>A (p.Arg243Gln) | mPKU
8 | 3m5d | 2,099 | c.929C>T (p.Ser310Phe) | c.1199+502A>T | cPKU
9 | 1m21d | 696 | c.1199+502A>T | c.722G>A (p.Arg241Cys) | mPKU
10 | 1m7d | 3,660 | c.331C>T (p.Arg111Ter) | c.1199+502A>T | cPKU
11 | 2m2d | 792 | c.707-59C>G | c.1084C>A (p.Pro362Thr) | mPKU
12 | 17m | 1,686 | c.1199+502A>T | c.611A>G (p.Tyr204Cys) | cPKU
13 | 1m17d | 456 | c.1301C>A (p.Ala434Asp) | c.1065+241C>A | mPKU
14 | 1m13d | 900 | c.1199+502A>T | c.1045T>G (p.Ser349Ala) | mPKU
15 | 3y3m | 2,700 | c.1199+502A>T | c.781C>T (p.Arg261Ter)/c.1256A>G (p.Gln419Arg) | cPKU
16 | 1m6d | 414 | c.1023G>C (p.Lys341Asn) | c.1065+241C>A | mPKU
17 | 6y10m | 936 | c.728G>A (p.Arg243Gln) | c.1065+241C>A | mPKU
18 | 1m8d | 828 | c.1162G>A (p.Val388Met) | NA | mPKU
19 | 2m5d | 780 | NA | NA | mPKU
20 | 1m10d | 840 | c.482T>C (p.Phe161Ser) | c.1065+241C>A | mPKU
21 | 26d | 528 | c.707-59C>G | c.208_210del (p.Ser70del) | mPKU
22 | 1m5d | 1,140 | c.1199+502A>T | c.332G>A (p.Arg111Gln)/c.1301C>A (p.Ala434Asp) | mPKU
23 | 30d | 1,494 | c.740G>T (p.Gly247Val) | c.1199+502A>T | cPKU
24 | 1m2d | 720 | c.913-7A>G | c.1199+502A>T | mPKU
25 | 8y3m | 1,506 | c.1065+241C>A | c.331C>T (p.Arg111Ter) | cPKU
26 | 1m | 1,920 | c.706+629A>C | c.526C>T (p.Arg176Ter) | cPKU
27 | 4m3d | 2,058 | c.158G>A (p.Arg53His)/c.842+2T>A | c.1199+502A>T | cPKU
28 | 1m20d | 540 | c.1065+241C>A | c.331C>T (p.Arg111Ter) | mPKU
29 | 1m9d | 2,274 | c.755G>A (p.Arg252Gln) | NA | cPKU
30 | 1m14d | 1,560 | c.611A>G (p.Tyr204Cys) | c.1065+241C>A | cPKU
31 | 21d | 360 | c.251A>G (p.Asp84Gly) | c.1199+502A>T | mPKU
32 | 1m2d | 2,160 | c.1199+502A>T | c.158G>A (p.Arg53His)/c.842+2T>A | cPKU
33 | 1y8m | 1,140 | c.728G>A (p.Arg243Gln) | c.1199+502A>T | mPKU
34 | 1m6d | 450 | c.284_286delTCA (p.Ile95del) | c.1065+241C>A | mPKU
35 | 16y | 1,860 | c.158G>A (p.Arg53His)/c.842+2T>A | c.1199+502A>T | cPKU
36 | 1m10d | 2,160 | c.1199+502A>T | c.442-1G>A | cPKU
37 | 2m11d | 558 | c.728G>A (p.Arg243Gln) | c.1065+241C>A | mPKU
NA: Not identified
cPKU: classic PKU (Phe ≥1,200 μmol/L); mPKU: mild PKU (Phe 360–1,200 μmol/L)
* Patient 5 was clinically diagnosed with classic PKU on the basis of a musty odor of the skin and urine, fair skin and intellectual disability.
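The footnote thresholds can be written as a small rule. The following is a minimal sketch, assuming that a level of exactly 1,200 μmol/L counts as cPKU (as the "≥" in the footnote suggests); the function name `classify_pku` and the label returned below 360 μmol/L are illustrative assumptions, not part of the study.

```python
def classify_pku(phe_umol_per_l: float) -> str:
    """Return the Table 1 class for a pre-treatment Phe level in μmol/L."""
    if phe_umol_per_l >= 1200:
        return "cPKU"  # classic PKU: Phe >= 1,200 μmol/L
    if phe_umol_per_l >= 360:
        return "mPKU"  # mild PKU: 360 <= Phe < 1,200 μmol/L
    # Assumed label: the footnote defines no class below 360 μmol/L.
    return "below mPKU range"


# Pre-treatment Phe levels copied from Table 1 (patients 1, 2, 22 and 31).
table1_phe = {1: 606, 2: 2340, 22: 1140, 31: 360}
classes = {patient: classify_pku(phe) for patient, phe in table1_phe.items()}
print(classes)
```

Patient 5 cannot be classified this way because no pre-treatment Phe level is available; that patient was classified clinically.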
After single-gene full-length sequencing, 74 potential disease-causing variant alleles were identified. A total of 33 patients were completely genotyped, of whom 28 carried compound heterozygous variants and 5 harbored three separate variants. Compared with the previous results, the detection rate in PKU patients increased from 94.6% (650/687) to 99.4% (683/687), an increase of approximately 5 percentage points. However, the genotypes of four patients remained unclear.
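The detection rates quoted above follow directly from the patient counts. As a quick arithmetic check (variable names are illustrative):

```python
total_patients = 687
genotyped_before = 650  # completely genotyped before full-length sequencing
genotyped_after = 683   # completely genotyped after full-length sequencing

rate_before = 100 * genotyped_before / total_patients
rate_after = 100 * genotyped_after / total_patients
gain_points = rate_after - rate_before  # difference in percentage points

print(f"before: {rate_before:.1f}%")
print(f"after:  {rate_after:.1f}%")
print(f"gain:   {gain_points:.1f} percentage points")
```

The gain corresponds to the 33 newly genotyped patients out of 687, that is, about 4.8 percentage points.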
In the 33 patients completely genotyped by full-length sequencing, all of the newly detected variant alleles were located in deep intronic regions: c.707-59C>G, c.1065+241C>A, c.1199+502A>T and a novel variant, c.706+629A>C (Table 2). The most frequent variant was c.1199+502A>T (57.6%), followed by c.1065+241C>A (33.3%), c.707-59C>G (6.1%) and c.706+629A>C (3.0%).
Table 2. Deep intronic variants identified by full-length sequencing of PAH.
Nucleotide aberration | Location | Frequency of detection in PKU patients | Proportion of mPKU | Proportion of cPKU | Variant classification
c.706+629A>C | intron 6 | 3.0% (1/33) | - | 100% (1/1) | Hot uncertain significance (PM3, PP4_Moderate, PM2_Supporting)
c.707-59C>G | intron 6 | 6.1% (2/33) | 100% (2/2) | - | Likely pathogenic (PM3_Strong, PP4_Moderate, BS1)
c.1065+241C>A | intron 10 | 33.3% (11/33) | 81.8% (9/11) | 18.2% (2/11) | Pathogenic (PM3_VeryStrong, PP4_Moderate, PM2_Supporting)
c.1199+502A>T | intron 11 | 57.6% (19/33) | 31.6% (6/19) | 68.4% (13/19) | Pathogenic (PM3_VeryStrong, PP4_Moderate, PM2_Supporting)
The novel variant c.706+629A>C was identified in a patient with cPKU. Family verification confirmed that it was compound heterozygous with c.526C>T (p.Arg176Ter). This variant is absent from population databases (no frequency in gnomAD). In silico analysis with RESCUE-ESE and ESEfinder predicted that this variant is probably damaging to the protein structure, but these predictions have not been confirmed by functional studies. Therefore, it was classified as a hot variant of uncertain significance[16] (PM3, PP4_Moderate, PM2_Supporting).
Combined with the phenotypic analysis, 68.4% of the patients carrying c.1199+502A>T had cPKU, whereas most of those carrying c.1065+241C>A (81.8%) had mPKU. The variant c.707-59C>G was identified in two patients with mPKU. According to the ClinGen PAH Expert Panel Specifications for the interpretation of genetic variants, these three deep intronic variants were classified as likely pathogenic or pathogenic.
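The frequencies and phenotype proportions in Table 2 can be recomputed from the carrier counts. This is a minimal sketch using the counts reported in Table 2; the dictionary layout and variable names are illustrative choices.

```python
# (mPKU carriers, cPKU carriers) per deep intronic variant, from Table 2.
carriers = {
    "c.706+629A>C": (0, 1),
    "c.707-59C>G": (2, 0),
    "c.1065+241C>A": (9, 2),
    "c.1199+502A>T": (6, 13),
}
n_genotyped = 33  # patients completely genotyped by full-length sequencing

summary = {}
for variant, (n_mpku, n_cpku) in carriers.items():
    n_total = n_mpku + n_cpku
    summary[variant] = {
        "frequency_pct": round(100 * n_total / n_genotyped, 1),
        "mpku_pct": round(100 * n_mpku / n_total, 1),
        "cpku_pct": round(100 * n_cpku / n_total, 1),
    }

for variant, row in summary.items():
    print(variant, row)
```

Each of the 33 patients carries exactly one of these four variants, so the carrier counts sum to 33.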
Identification of large-scale deletions/duplications of PAH
We selected five patients (Table 3), in whom MLPA detected four kinds of deletion and duplication variants involving exons 4, 5, 6 and 12, and exon 1 with its upstream region (Supplementary figure 1). Full-length sequencing data revealed the genomic regions of PAH that were deleted or duplicated in these samples. By comparing these regions with the PAH reference sequence, we determined which exons and introns they contained. For all five samples, the results were consistent with the MLPA analysis.
Table 3. Large-scale deletions/duplications of PAH identified by MLPA and full-length sequencing.
Patient | Age | Pre-treatment Phe levels (μmol/L) | PAH allele 1: MLPA result | PAH allele 1: full-length sequencing result (involved exons and introns) | PAH allele 2
38 | 1m | 2,254.80 | exon4-5 deletion | NC_000012.11: g.103256126-103272397 del (exon4-5) | c.1162G>A (p.Val388Met)
39 | 1m3d | 2,484 | exon6 deletion | NC_000012.11: g.103248768-103249219 del (exon6) | c.478C>T (p.Gln160Ter)
40 | 1y9m | 1,428 | exon1 and upstream deletion | NC_000012.11: g.103311316-103315071del (exon1 and upstream) | c.1238G>C (p.Arg413Pro)
41 | 2m12d | 768 | exon1 and upstream deletion | NC_000012.11: g.103311023-103312086 del (exon1 and upstream) | c.158G>A (p.Arg53His)
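The size of each deleted region follows from the genomic coordinates in Table 3. The sketch below parses the coordinate strings as written in the table and assumes that both endpoints are included in the deletion (the usual HGVS convention); the helper name `deletion_length` and the regular expression are illustrative assumptions.

```python
import re

# Full-length sequencing results copied from Table 3 (patients 38 to 41).
table3_deletions = {
    38: "NC_000012.11: g.103256126-103272397 del",
    39: "NC_000012.11: g.103248768-103249219 del",
    40: "NC_000012.11: g.103311316-103315071del",
    41: "NC_000012.11: g.103311023-103312086 del",
}

# Accepts "-" or "_" between coordinates and optional spaces before "del".
COORD_PATTERN = re.compile(r"g\.(\d+)[-_](\d+)\s*del")


def deletion_length(description: str) -> int:
    """Return the deleted length in bp, assuming inclusive endpoints."""
    match = COORD_PATTERN.search(description)
    if match is None:
        raise ValueError(f"no deletion coordinates found in {description!r}")
    start, end = int(match.group(1)), int(match.group(2))
    return end - start + 1


lengths = {patient: deletion_length(text) for patient, text in table3_deletions.items()}
for patient, length in lengths.items():
    print(f"patient {patient}: {length} bp deleted")
```

Under this assumption, the deletions range from 452 bp (patient 39) to 16,272 bp (patient 38).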
Distribution of PAH gene variant types
With supplementary single-gene full-length sequencing, 683 of 687 PKU patients were completely genotyped. The variant types of the fully genotyped patients are summarized in Table 4. In 612 (89.6%) patients, all variants were located in exons and flanking intronic regions and were detected by conventional sequence analysis, including Sanger sequencing, NGS gene panels and whole-exome sequencing (WES). Thirty-three (4.8%) patients carried deep intronic variants that were identified by whole-genome sequencing (WGS) and single-gene full-length sequencing. In addition, 38 (5.6%) patients harbored large-scale deletions/duplications that were detected by gene-targeted deletion/duplication analysis. Notably, among the molecular genetic tests used in PKU, only WGS and single-gene full-length sequencing detected all of the above variant types.
Table 4. Distribution of variant types and molecular genetic testing in PKU.
Variant types | Proportion of probands | Molecular genetic testing
SNV/indel | 94.4% (645/683) | Sequence analysis
  in exons and flanking intron regions | 89.6% (612/683) | Sanger sequencing, gene panel, WES, WGS, single-gene full-length sequencing
  in deep introns | 4.8% (33/683) | WGS, single-gene full-length sequencing
PAH deletion/duplication | 5.6% (38/683) | Gene-targeted deletion/duplication analysis: quantitative PCR, long-range PCR, MLPA, WGS, single-gene full-length sequencing
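The observation that only two methods cover every variant type can be expressed as a set intersection over the method lists in Table 4. This is a minimal sketch; the dictionary keys are shortened labels chosen for illustration.

```python
# Methods listed in Table 4 for each variant type.
methods_by_variant_type = {
    "SNV/indel in exons and flanking introns": {
        "Sanger sequencing", "gene panel", "WES", "WGS",
        "single-gene full-length sequencing",
    },
    "SNV/indel in deep introns": {
        "WGS", "single-gene full-length sequencing",
    },
    "PAH deletion/duplication": {
        "quantitative PCR", "long-range PCR", "MLPA", "WGS",
        "single-gene full-length sequencing",
    },
}

# Methods that appear in every row of the table.
covers_all_types = set.intersection(*methods_by_variant_type.values())
print(sorted(covers_all_types))
```

The intersection contains only WGS and single-gene full-length sequencing, in line with the statement in the text.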
Characteristics of variants in deep introns and untranslated regions
Manual screening of the PAHvdb database yielded seven non-benign deep intronic variants, three of which were also identified in our study. In addition, we found a novel variant, c.706+629A>C, that had not been reported previously. We therefore analyzed a total of eight variants, which are listed in Supplementary table 1.
Among these variants, seven were detected in Asian populations, including Chinese and Iranian populations. In terms of location, all of the variants mapped to the catalytic domain (Figure 2).