Characterization of PAH Gene Mutations and Analysis of Genotype-Phenotype Correlation in Patients with Phenylalanine Hydroxylase Deficiency from Fujian Province, Southeastern China

Phenylalanine hydroxylase deficiency (PAHD), the most prevalent inborn error of amino acid metabolism in China, has a complex phenotype, with many variants and genotypes that differ among populations. Here, we analyzed phenylalanine hydroxylase (PAH) gene mutations in a cohort of 93 PAHD patients from Fujian Province and examined the correlation between genotype and phenotype in these patients. Forty-four different pathogenic variants were identified, including five novel variants. The three most prevalent mutations among all patients were p.Arg53His (18.03%), p.Arg241Cys (14.75%), and p.Arg243Gln (7.65%). The frequency of the p.Arg53His variant was the highest in patients with mild hyperphenylalaninemia (MHP), while the frequencies of the p.Val399= and p.Arg111Ter variants were the highest in patients with classic phenylketonuria (cPKU). The most abundant genotypes observed in PAHD patients were p.Arg53His/p.Arg243Gln, p.Arg53His/IVS4-1G>A, and p.Arg53His/p.Arg241Cys. For genotype-phenotype prediction, the allelic phenotype value/genotypic phenotype value (APV/GPV) system performed well in predicting the actual phenotype, with an overall consistency rate of 85.71% for PAHD patients. In conclusion, we established a PAH gene mutation spectrum for PAHD patients in Southeastern China. A quantitative correlation analysis between genotype and phenotype severity is helpful for genetic counseling and management.

PAH catalyzes the hydroxylation of L-phenylalanine (L-Phe) to L-tyrosine, using tetrahydrobiopterin (BH4) as a cofactor. Mutations in the PAH gene lead to a loss of enzyme activity and an increase in serum concentrations of Phe. Based on the severity of the metabolic phenotype, PAHD is classified into three types: classic phenylketonuria (cPKU), mild PKU (mPKU), and mild hyperphenylalaninemia (MHP). If left untreated, abnormal accumulation of serum Phe can damage the peripheral and central nervous systems, resulting in varying degrees of mental retardation, seizures, and cerebral palsy [6]. Therefore, early diagnosis and treatment of PKU are of great importance to avoid permanent damage.
The human PAH gene, located on chromosome 12q23.2, comprises 13 exons and 12 introns. To date, more than 1,200 different mutations in the PAH gene have been reported in patients with PAHD and registered in the Phenylalanine Hydroxylase Gene Locus-Specific Database (PAHdb; http://www.pahdb.mcgill.ca). The detection of PAH mutations and the analysis of the correlation between genotype and clinical phenotype are extremely valuable for genetic counseling, the selection of the most suitable therapeutic options, and prognosis prediction [7][8][9][10][11].
Although several genotype-based methods for phenotype prediction have been applied to this disease, their predictive value needs to be further clarified [12][13][14][15]. In addition, because PAH mutations show great regional and ethnic heterogeneity, it is important to establish the spectrum of PAH mutations for each specific population. To date, few comprehensive studies of PAHD population genetics have focused on the Chinese Han population in Fujian Province, Southeastern China.
In this study, we analyzed the PAH mutations in a cohort of 93 PAHD patients from Fujian Province using next-generation sequencing (NGS) and Sanger sequencing with the aim of characterizing the distribution of PAH mutations in this region. Additionally, we analyzed the genotype and phenotype correlation in patients with PAHD.

Study subjects
A total of 93 children (56 males and 37 females) diagnosed at the Medical Genetic Diagnosis and Therapy Center of Fujian Province Maternal and Child Health Hospital between January 2016 and September 2021 were included in this study. All patients and their parents were of the Chinese Han population.
All patients were identified through a neonatal hyperphenylalaninemia screening program. We applied tandem mass spectrometry to measure plasma phenylalanine concentrations from dried blood samples before treatment was started. All patients had plasma Phe levels >120 μmol/L and Phe:Tyr ratios >2. Additionally, urinary pterin analysis and a dihydropteridine reductase activity assay on dried blood spot samples were performed to exclude patients with tetrahydrobiopterin reductase deficiency. Based on their plasma Phe concentrations, the patients were diagnosed with cPKU (Phe ≥ 1,200 μmol/L), mPKU (Phe: 360-1,200 μmol/L), or MHP (Phe: 120-360 μmol/L) [16].
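The diagnostic cutoffs above can be sketched as a small classifier. This is only an illustrative sketch, not the authors' procedure: the function name is hypothetical, and because the stated ranges share the boundary value 360 μmol/L, the sketch assumes that exactly 360 μmol/L is assigned to mPKU. Values at or below 120 μmol/L are reported as outside the inclusion criterion (>120 μmol/L).

```python
def classify_pahd(phe_umol_per_l: float) -> str:
    """Assign a PAHD phenotype from the pretreatment Phe level (umol/L).

    Cutoffs follow the text: cPKU >= 1,200; mPKU 360-1,200; MHP 120-360.
    Assumption (not stated in the source): a value of exactly 360 is
    treated as mPKU.
    """
    if phe_umol_per_l >= 1200:
        return "cPKU"
    if phe_umol_per_l >= 360:
        return "mPKU"
    if phe_umol_per_l > 120:
        return "MHP"
    # Patients were included only if Phe exceeded 120 umol/L.
    return "below inclusion threshold"


if __name__ == "__main__":
    for phe in (150, 360, 800, 1200, 1500):
        print(phe, classify_pahd(phe))
```

How a laboratory handles values that fall exactly on a boundary is a local convention, so the tie-breaking rule here should be adapted rather than taken as given.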

Genotype analysis
Genomic DNA was isolated using the QIAamp DNA Mini Kit (Qiagen, Germany), following the manufacturer's instructions. NGS was performed by Biosan (Zhejiang, China) to detect PAH gene variants, and the detected mutations were then confirmed using Sanger sequencing. To determine the parental origin of each variant, the respective parents were screened for the variants detected in their offspring.
The obtained sequences were compared with the wild-type transcript of human PAH (NM_000277) to identify potential mutations. Mutation nomenclature followed the guidelines and recommendations of the Human Genome Variation Society (http://varnomen.hgvs.org/). Gene variants were classified according to the ACMG guidelines (https://clinicalgenome.org/).
To exclude polymorphic sites in the population, all detected gene mutations were tested against the 1000 Genomes Project, dbSNP, and ExAC databases.

Genotype-Phenotype prediction
Two different algorithms were used to analyze the correlation between genotype and phenotype. First, based on the in vitro residual activity associated with the PAH variants [12], the gene variants were predicted to cause one of three phenotypes: cPKU (21.1% ± 7.0%), mPKU (40.2% ± 7.6%), or MHP (52.1% ± 8.5%). Second, the allelic phenotype values (APVs) of the PAH variants were queried in the BIOPKU database, and phenotypes were predicted using genotypic phenotype values (GPVs) [13], each equal to the higher of the two allelic APVs: (i) GPV = 0-2.7 for cPKU, (ii) GPV = 2.8-6.6 for mPKU, and (iii) GPV = 6.7-10 for MHP [10]. Of note, we could not predict the phenotype for one mutation that did not have an APV in the BIOPKU database. Finally, frameshift, splice-site, and nonsense mutations were defined as null alleles.
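The APV/GPV rule described above can be illustrated with a minimal sketch. The function name and the example APVs are hypothetical and are not taken from the BIOPKU database; the sketch also assumes that APVs are reported to one decimal place, so that no value falls between the stated ranges (for example, between 2.7 and 2.8).

```python
from typing import Optional, Tuple


def predict_phenotype_from_apv(
    apv_allele_1: Optional[float],
    apv_allele_2: Optional[float],
) -> Optional[Tuple[float, str]]:
    """Return (GPV, predicted phenotype), or None if an APV is missing.

    GPV is the higher of the two allelic APVs, as described in the text.
    Ranges: 0-2.7 cPKU, 2.8-6.6 mPKU, 6.7-10 MHP.
    Assumption: APVs carry one decimal place.
    """
    if apv_allele_1 is None or apv_allele_2 is None:
        # A variant without an APV cannot be used for prediction.
        return None
    for apv in (apv_allele_1, apv_allele_2):
        if not 0 <= apv <= 10:
            raise ValueError(f"APV out of range 0-10: {apv}")
    gpv = max(apv_allele_1, apv_allele_2)
    if gpv <= 2.7:
        phenotype = "cPKU"
    elif gpv <= 6.6:
        phenotype = "mPKU"
    else:
        phenotype = "MHP"
    return gpv, phenotype


if __name__ == "__main__":
    # Illustrative APV pairs only, not real database values.
    for pair in ((0.0, 0.0), (2.0, 5.0), (0.0, 10.0), (None, 5.0)):
        print(pair, predict_phenotype_from_apv(*pair))
```

Because the GPV is the maximum of the two APVs, a single mild allele determines the predicted phenotype regardless of how severe the second allele is.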

Prediction of genotype-phenotype correlations in PAH-deficient patients
A total of 84 patients met the requirements for phenotype prediction based on APV/GPV system analysis. This analysis accurately predicted the phenotype of 85.71% of these patients, including 100% of the patients with MHP (Table 4).

De novo variants pedigrees
Once the PAH gene variants of the patients had been confirmed, paternity testing was carried out to verify the biological relationship between each patient and the parents, with the aim of confirming whether the variants were inherited from the parents.

Novel sequence variants
Five novel variants that were not recorded in the BIOPKU database were detected in this study, comprising three missense mutations (c.103A>G  Table 5.

Discussion
In this study, 93 patients with PAHD who had been identified via national newborn screening in the past 5 years were included. The mutation detection rate was 98.39% (183/186 alleles). This high detection rate is consistent with previous studies confirming the universality of PAH gene mutations in PAHD. Variants in six exons (7, 2, 11, 12, 6, and 9) accounted for 84.57% of the total variants, which is consistent with previous reports from other regions of China and Asia [15][19][20][21][22]. The similarities in the variant spectra of East Asian populations support the view that these populations share similar evolutionary and migration histories. A total of 44 distinct mutations were detected, nearly half of which were observed only once, indicating a high degree of genetic heterogeneity among the PAHD population in Fujian Province. Nine mutations represented 51.36% of the total among the PAHD patients, the most common of which were p.Arg53His and p.Arg241Cys. However, previous studies reported that p.Arg243Gln and EX6-96A>G were the most prevalent variants in Chinese populations [5,23]. This inconsistency may be related to the high proportion of MHP patients included in this study and to the existence of region-specific alleles.
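The reported detection rate can be checked with simple arithmetic. The sketch below assumes that the denominator of 186 corresponds to two PAH alleles for each of the 93 patients; the variable names are illustrative.

```python
# Each patient carries two PAH alleles (assumed basis of the denominator).
patients = 93
alleles_total = 2 * patients

# Number of alleles on which a variant was detected, as reported.
alleles_detected = 183

detection_rate = 100 * alleles_detected / alleles_total
print(f"{alleles_detected}/{alleles_total} = {detection_rate:.2f}%")
```

The three undetected alleles match the three patients with only one identified PAH variant who are mentioned among the study limitations.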
The results of this study show that six mutation sites differed significantly in frequency among the three patient phenotypes. The variants p.Arg53His and p.Phe392Ile had a higher frequency in MHP patients (P = 0.001 and 0.014, respectively), while p.Arg241Cys and p.Arg408Gln had a higher frequency in mPKU patients. The higher frequency of p.Arg53His in MHP patients is in line with the results of previous large-sample studies [3,17]. In vitro studies have reported that the residual activity of the p.Arg53His-type PAH enzyme is equivalent to 79% of that of the wild type [12,24]. Whether the p.Arg53His variant is a polymorphism remains controversial. The findings of a previous study suggested that patients with PAHD carrying p.Arg53His did not require frequent Phe monitoring, unlike those with PKU [25].
In this study, we found that the p.Arg53His variant is closely associated with MHP. In line with these results, previous studies have classified the p.Arg53His variant as "likely benign" [3,4,26]. Moreover, the proportion of patients with MHP in Fujian Province was significantly higher than that reported by the BIOPKU database and in previous reports from other regions of China. This might be due to the higher frequency of the p.Arg53His variant in this region. Finally, the p.Phe392Ile variant was only observed in patients with MHP, in line with the results of previous studies [3,17].
Previous studies showed that the detection rate of p.Arg241Cys in mPKU and MHP patients from southern China was higher than that in other areas [3,27,28]. The results of this study are consistent with those of previous studies [27,28], suggesting that regional and demographic differences are reflected to a certain degree in our study. Although Arg241 is located close to the cofactor binding region, this amino acid is not directly involved in the interaction with the cofactor [27][29]. This may be related to the high proportion of patients with MHP in this study. The high proportion of patients with MHP may also reflect regional and ethnic heterogeneity. Interestingly, two patients homozygous for p.Arg241Cys showed an MHP phenotype, which was also found in previous studies [3,19].
Previous studies have attempted to uncover a quantitative correlation between genotype and phenotype in PAHD patients using a series of different algorithms [13][14][30][31][32]. APV and GPV have been identified as systems with high sensitivity and specificity for phenotypic prediction [4,13]. In this study, the APV/GPV-based prediction system performed well in predicting the actual phenotype, with an overall consistency rate of 85.71% for PAHD and 100% for MHP. These findings strongly support the close correlation between the genotypes and phenotypes underlying PAHD. These results also provide basic and valuable data for the prenatal diagnosis and prevention of PAHD. In contrast, the prediction system based on the average in vitro residual activity of the two alleles did not perform well, with a prediction accuracy of only 65.82% for PAHD and 42.42% for mPKU. Inconsistencies between genotype and phenotype occurred mainly in patients with compound heterozygous variants. The mechanism underlying these discrepancies requires further clarification. It has been reported that co-expression of different PAH gene variants might lead to a residual activity different from the predicted activity because of intermolecular interactions [33]. Nevertheless, the accuracy of both prediction systems in predicting PKU needs to be further improved, and further studies should optimize these systems to predict phenotypes accurately.
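A consistency rate of this kind is the percentage of patients whose predicted phenotype matches the observed phenotype, calculated overall and within each observed phenotype. The sketch below shows one way to compute it; the function name and the toy data are hypothetical and do not reproduce the study cohort.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def consistency_rates(
    observed: Sequence[str],
    predicted: Sequence[str],
) -> Tuple[float, Dict[str, float]]:
    """Return (overall %, per-phenotype %) agreement between two label lists."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one patient is required")
    totals = Counter(observed)
    # Count matches, grouped by the observed phenotype.
    matches = Counter(o for o, p in zip(observed, predicted) if o == p)
    per_phenotype = {ph: 100 * matches[ph] / n for ph, n in totals.items()}
    overall = 100 * sum(matches.values()) / len(observed)
    return overall, per_phenotype


if __name__ == "__main__":
    # Toy labels for illustration only.
    observed = ["MHP", "MHP", "mPKU", "mPKU", "cPKU", "cPKU", "cPKU"]
    predicted = ["MHP", "MHP", "mPKU", "cPKU", "cPKU", "cPKU", "mPKU"]
    overall, by_phenotype = consistency_rates(observed, predicted)
    print(f"overall: {overall:.2f}%")
    for phenotype, rate in by_phenotype.items():
        print(f"{phenotype}: {rate:.2f}%")
```

Reporting the per-phenotype rates alongside the overall rate makes it visible when a system performs well for one subgroup, such as MHP, but poorly for another, such as mPKU.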
In conclusion, we constructed a PAH gene mutation spectrum of the PAHD patients of Fujian Province and identified novel mutations that broaden the PAH gene mutation spectrum. Exploring and clarifying the differences in the frequencies of different PAH gene variants among patients with different sub-phenotypes may help to understand the correlation between compound heterozygosity and phenotype. In addition, we demonstrated that genotype-phenotype prediction using the APV/GPV system resulted in a higher prediction accuracy than when the results of residual enzyme activity in vitro were used, illustrating that APV/GPV prediction could be a suitable tool for genetic counseling of PAHD families.
However, this study also presented some limitations. Three patient samples with only one PAH gene variant were not further assessed by multiplex ligation-dependent probe amplification analysis. In addition, the APV/GPV prediction system should be validated using a large number of mutations to verify its accuracy in the future.

Forty-four variants were identified in PAHD patients from Fujian Province. Novel variants identified in this study are depicted in red. E: exon; I: intron.