Quality control
The demographic data of the 321 newborns are summarized in Table 1. The average sequencing depth of the 321 samples was 47.42× (range 28.84–82.90×), and the average coverage was 99.48% (range 99.01–99.89%) (Supplementary Fig. 1).
Table 1
Summary of demographic data collected from the 321 newborns of the Qingdao cohort
| Type | Number | Percentage |
Pregnancy | Natural pregnancy | 306 | 95.4% |
| Assisted reproduction | 11 | 3.4% |
| Unknown | 4 | 1.2% |
Gestational weeks | Premature birth | 7 | 2.2% |
| Gestational age at delivery (ave.) | 39 weeks plus 5 days | |
| SD | 9.47 | |
Gender of newborns | Boys | 151 | 47% |
| Girls | 170 | 53% |
Parental age at delivery | Father’s age (ave.) | 33 | |
| Mother’s age (ave.) | 32 | |
| SD of father’s age | 5 | |
| SD of mother’s age | 4 | |
Inherited diseases
We restricted the analysis to the 109 genes underlying the 61 diseases listed in the Recommended Uniform Screening Panel (RUSP). A total of 121 pathogenic or likely pathogenic variants were detected in the 321 children (30.53% of the samples), mainly in a heterozygous state (Table 2). Thirty-one inherited diseases were associated with these 121 pathogenic and likely pathogenic variants, with hearing loss being the most common. Twenty-one newborns carried more than two genetic variants (Supplementary Table 4).
Table 2
Overview of the heritable diseases identified in 321 newborn children from Qingdao
No. | Core Condition | Inheritance | Gene | Variants | Classification* | No. of Variants | Het/Hom |
1 | Propionic Acidemia | AR | PCCB | c.1364A > G | P | 1 | Het |
| | | | c.793G > A | LP | 1 | Het |
2 | Methylmalonic Acidemia (methylmalonyl-CoA mutase) | AR | MUT | c.2179C > T | P | 1 | Het |
3 | Methylmalonic Acidemia (Cobalamin disorders) | AR | MMAA | c.658G > A | LP | 1 | Het |
| | | MMACHC | c.315C > G* | P | 1 | Het |
| | | | c.445_446del* | P | 1 | Het |
| | | | c.482G > A* | P | 2 | Het |
| | | | c.609G > A* | P | 3 | Het |
| | | | c.658_660del* | P | 3 | Het |
| | | | c.80A > G* | P | 1 | Het |
| | | MMADHC | c.748C > T* | P | 1 | Het |
4 | 3-Methylcrotonyl-CoA Carboxylase Deficiency | AR | MCCC1 | c.639 + 2T > A | P | 1 | Het |
5 | Holocarboxylase synthase Deficiency | AR | HLCS | c.782del | P | 2 | Het |
6 | Glutaric Acidemia Type I | AR | GCDH | c.1213A > G | P | 1 | Het |
7 | Carnitine Uptake Defect/Carnitine Transport Defect | AR | SLC22A5 | c.1472C > G | P | 4 | Het |
| | | | c.468G > A | P | 1 | Het |
8 | Medium-chain Acyl-CoA Dehydrogenase Deficiency | AR | ACADM | c.548_551del | P | 2 | Het |
9 | Trifunctional Protein Deficiency | AR | HADHB | c.1175C > T | LP | 1 | Het |
10 | Citrullinemia, Type I | AR | ASS1 | c.352G > A | LP | 1 | Het |
11 | Homocystinuria | AR | MMADHC | c.748C > T* | P | 1 | Het |
12 | Classic Phenylketonuria | AR | PAH | c.1301C > A* | LP | 1 | 7 Het; 1 compound Het |
| | | | c.611A > G* | P | 2 | |
| | | | c.728G > A* | P | 4 | |
| | | | c.740G > T* | P | 1 | |
| | | | c.842 + 2T > A* | P | 1 | |
13 | Primary Congenital Hypothyroidism | AR,AD | TSHR | c.1349G > A | P | 4 | Het |
| | AR | DUOX2 | c.1588A > T | P | 6 | Het |
| | | | c.1883del | P | 1 | Het |
| | | | c.1946C > A | LP | 1 | Het |
| | | | c.3329G > A | P | 1 | Het |
| | AR | TPO | c.2422del | P | 1 | Het |
14 | Congenital adrenal hyperplasia | AR | CYP21A2 | c.518T > A | P | 2 | Het |
| | | | c.844G > T | P | 1 | Het |
| | | | c.92C > T | P | 2 | Het |
15 | S, Beta-Thalassemia | AR,AD | HBB | c.126_129del* | P | 1 | Het |
16 | Cystic Fibrosis | AR | CFTR | c.2052_2053insA | P | 1 | Het |
17 | Classic Galactosemia | AR | GALT | c.821-7A > G | P | 1 | Het |
| | AR | GALT | c.844C > G | LP | 1 | Het |
18 | Glycogen Storage Disease Type II (Pompe) | AR | GAA | c.2237G > C | P | 1 | Het |
| | | | c.2662G > T | P | 1 | Het |
| | | | c.2647-7G > A | LP | 1 | Het |
19 | Hearing Loss | AD,AR,DD (digenic dominant) | GJB2 | c.109G > A | P | 14 | 26 Het; 2 compound Het |
| | | | c.235del | P | 10 | |
| | | | c.299_300del | P | 4 | |
| | | | c.605_606insAGAAGACTGTCTTCACAGTGTTCATGATTGCAGTGTCTGGAATTTG | P | 2 | |
| | AR | SLC26A4 | c.1174A > T | P | 1 | Het |
| | AR | | c.1229C > T | P | 1 | Het |
| | AR | | c.1262A > C | LP | 1 | Het |
| | AR | | c.2027T > A | LP | 2 | Het |
| | AR | | c.2168A > G | P | 1 | Het |
| | AR | | c.919-2A > G | P | 2 | Het |
| | AR | USH2A | c.2802T > G | P | 1 | Het |
| | | | c.99_100insT | P | 1 | Het |
| | | | c.8559-2A > G | P | 1 | Het |
20 | Severe Combined Immunodeficiencies | AR | ADA | c.424C > T | P | 1 | Het |
| | | | c.872C > T | P | 1 | Het |
| | AR | JAK3 | c.1744C > T | LP | 1 | Het |
| | | | c.307C > T | LP | 1 | Het |
21 | Spinal Muscular Atrophy due to homozygous deletion of exon 7 in SMN1 | AR | SMN1 | - | P | 6 | Het |
No. | Secondary Condition | Inheritance | Gene | Variants | Classification | No. of Variants | Het/Hom |
22 | Methylmalonic acidemia with homocystinuria | AR | MMACHC | c.315C > G* | P | 1 | Het |
| | | | c.445_446del* | P | 1 | Het |
| | | | c.482G > A* | P | 2 | Het |
| | | | c.609G > A* | P | 3 | Het |
| | | | c.658_660del* | P | 3 | Het |
| | | | c.80A > G* | P | 1 | Het |
| | AR | MMADHC | c.748C > T* | P | 1 | Het |
23 | Short-chain acyl-CoA dehydrogenase deficiency | AR | ACADS | c.1031A > G | P | 1 | Het |
24 | Glutaric acidemia type II | AR | ETFDH | c.1211T > C | LP | 1 | Het |
25 | Citrullinemia, type II | AR | SLC25A13 | c.1180 + 1G > A | P | 1 | Het |
| | | | c.852_855del | P | 2 | Het |
26 | Hypermethioninemia | AR | GNMT | c.149T > C | LP | 1 | Het |
27 | Benign hyperphenylalaninemia | AR | PAH | c.1301C > A* | LP | 1 | Het |
| | | | c.611A > G* | P | 2 | Het |
| | | | c.728G > A* | P | 4 | Het |
| | | | c.740G > T* | P | 1 | Het |
| | | | c.842 + 2T > A* | P | 1 | Het |
28 | Biopterin defect in cofactor biosynthesis | AR | PTS | c.166G > A | P | 1 | Het |
| | | | c.259C > T | P | 1 | Het |
| | | | c.84-291A > G | P | 3 | Het |
29 | Biopterin defect in cofactor regeneration | AR | PTS | c.166G > A | P | 1 | Het |
| | | | c.259C > T | P | 1 | Het |
| | | | c.84-291A > G | P | 3 | Het |
30 | Various other hemoglobinopathies | AR | HBB | c.126_129del* | P | 1 | Het |
31 | Galactoepimerase deficiency | AR | GALE | c.505C > T | P | 1 | Het |
Notes: Recommended Uniform Screening Panel as of July 2018 (https://www.hrsa.gov/advisory-committees/heritable-disorders/rusp/index.html). The classification of pathogenicity is based on the ACMG guidelines, with reference to ClinVar annotations and the reported literature. Variants marked with an asterisk (*) were associated with more than one disease.
Among the 321 subjects, three children carried compound heterozygous variants predicted computationally to be pathogenic or likely pathogenic. Two children had variants in the GJB2 gene (NM_004004.5, c.109G > A, p.V37I with NM_004004.5, c.235del, p.L79Cfs*3; and NM_004004.5, c.109G > A, p.V37I with NM_004004.5, c.299_300del, p.H100Rfs*14), and one carried two variants in the PAH gene (NM_000277.1, c.611A > G, p.Y204C and NM_000277.1, c.842 + 2T > A). Sanger sequencing confirmed that one variant was inherited from her/his mother. However, as the infant’s father’s sample was not available, we could not determine whether the small deletion and insertion was inherited from the father or was a de novo variant (Supplementary Fig. 1). Follow-up of the three families confirmed that the child with compound heterozygosity in PAH has been diagnosed with PKU, while the two children with GJB2 variants have not yet shown signs of hearing loss. Previous studies report that homozygous or compound heterozygous c.109G > A variants are associated with slight to mild deafness and show incomplete penetrance, which can lead to late-onset deafness (16, 17). After genetic counselling, the two children with GJB2 variants were therefore scheduled to undergo hearing testing every six months.
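The screening step behind these findings can be illustrated with a minimal Python sketch. It assumes a hypothetical list of per-sample variant calls: the sample IDs, the tuple layout, and the function name `candidate_compound_hets` are illustrative and not part of the study pipeline. The sketch flags samples that carry two or more distinct heterozygous P/LP variants in the same gene. Such samples are only candidates, because the phase (cis or trans) cannot be determined without parental genotypes, which is why the parental Sanger sequencing described above was needed.

```python
from collections import defaultdict

# Hypothetical calls: (sample_id, gene, variant, zygosity, classification).
# Sample IDs are invented for illustration; the variants appear in Table 2.
calls = [
    ("S001", "GJB2", "c.109G>A", "Het", "P"),
    ("S001", "GJB2", "c.235del", "Het", "P"),
    ("S002", "PAH", "c.611A>G", "Het", "P"),
    ("S002", "PAH", "c.842+2T>A", "Het", "P"),
    ("S003", "GJB2", "c.109G>A", "Het", "P"),
    ("S004", "DUOX2", "c.1588A>T", "Het", "P"),
    ("S004", "TSHR", "c.1349G>A", "Het", "P"),
]


def candidate_compound_hets(variant_calls):
    """Return {(sample, gene): sorted variants} for every sample with at
    least two distinct heterozygous P/LP variants in the same gene."""
    by_sample_gene = defaultdict(set)
    for sample, gene, variant, zygosity, classification in variant_calls:
        if zygosity == "Het" and classification in {"P", "LP"}:
            by_sample_gene[(sample, gene)].add(variant)
    return {
        key: sorted(variants)
        for key, variants in by_sample_gene.items()
        if len(variants) >= 2
    }


candidates = candidate_compound_hets(calls)
for (sample, gene), variants in sorted(candidates.items()):
    print(sample, gene, variants)
```

In this toy input, S004 is not flagged because its two variants lie in different genes, and S003 is not flagged because it carries a single variant.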
Primary immunodeficiency diseases
The IUIS summary of PID genes (15) was used to identify potential variants in 151 immunodeficiency-associated genes. Altogether, 11 heterozygous pathogenic/likely pathogenic variants in eight genes were identified in 11 of the 321 newborn children (Table 3). All of these variants were detected in a heterozygous state; no child was found to carry homozygous or compound heterozygous variants in the immunodeficiency genes recorded as recessive in the IUIS summary.
Table 3
Overview of the IUIS recommended PIDs identified in 321 newborn children from Qingdao
No. | Condition | Inheritance | Gene | Variants | Classification* | No. of Variants | Het/Hom |
1 | Adenosine deaminase (ADA) deficiency | AR | ADA | c.424C > T | P | 1 | Het |
| | | | c.872C > T | P | 1 | Het |
2 | Ataxia-telangiectasia | AR | ATM | c.67C > T | P | 1 | Het |
3 | MOPD1 deficiency (Roifman syndrome) | AR | CLASP1 | c.196-562G > A | P | 1 | Het |
4 | Immunoskeletal dysplasia with neurodevelopmental abnormalities (EXTL3 deficiency) | AR | EXTL3 | c.1970A > G | P | 1 | Het |
5 | EDA-ID due to NEMO/IKBKG deficiency (ectodermal dysplasia, immune deficiency) | XLR | IKBKG | c.518G > A | LP | 1 | Het (Female carrier) |
6 | JAK3 deficiency | AR | JAK3 | c.1744C > T | LP | 1 | Het |
| | | | c.307C > T | LP | 1 | Het |
7 | DNA ligase IV deficiency | AR | LIG4 | c.1271_1275del | P | 1 | Het |
8 | TACI deficiency (Immunodeficiency, common variable) | AD/AR | TNFRSF13B | c.542C > A | LP | 2 | Het |
Notes: *P: Pathogenic; LP: Likely pathogenic.
Of note, one girl carried a heterozygous variant in IKBKG, which is associated with X-linked NEMO deficiency (ectodermal dysplasia, immune deficiency). Two children carried a likely pathogenic variant (NM_012452.2, c.542C > A, p.A181E) in TNFRSF13B, which has been reported to be associated with primary immunodeficiency, in particular common variable immunodeficiency (CVID), although with low penetrance. The A181E variant has been identified in affected patients in heterozygous, homozygous and compound heterozygous states, and the inheritance pattern of the disease is recorded as both autosomal dominant and autosomal recessive in the OMIM database. It should be noted, however, that the variant has also been seen in unaffected individuals: according to the gnomAD database, it has been found in 603 of 24,990 (2.4%) Finnish alleles and 1,540 of 282,092 (0.55%) alleles of different ethnic backgrounds, whereas the prevalence of CVID is around 1/10,000 to 1/50,000 in North America and Europe (18).
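The contrast between the allele frequency and the disease prevalence can be made concrete with a short back-of-the-envelope calculation. The Python sketch below is illustrative only and rests on two assumptions that are not part of the study: genotypes follow Hardy-Weinberg proportions, and, for the upper bound, every CVID case carries the variant. The function name `carrier_frequency` is hypothetical.

```python
def carrier_frequency(allele_freq):
    """Expected fraction of individuals with at least one copy of the
    allele, assuming Hardy-Weinberg proportions."""
    return 1.0 - (1.0 - allele_freq) ** 2


# Allele counts quoted in the text (gnomAD).
finnish_af = 603 / 24_990
overall_af = 1_540 / 282_092

# CVID prevalence range quoted in the text (18).
prevalence_upper = 1 / 10_000
prevalence_lower = 1 / 50_000

overall_carriers = carrier_frequency(overall_af)

# Upper bound on penetrance: assumes every CVID case carries the variant,
# which overstates the variant's contribution.
max_penetrance = prevalence_upper / overall_carriers

print(f"Finnish allele frequency: {finnish_af:.2%}")
print(f"Overall allele frequency: {overall_af:.2%}")
print(f"Expected carriers (overall): {overall_carriers:.2%}")
print(f"Upper bound on penetrance: {max_penetrance:.2%}")
```

Under these assumptions roughly one person in a hundred would carry the variant, so even the higher prevalence estimate implies that fewer than 1% of carriers develop CVID, in line with the low penetrance noted above.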
Pharmacogenomics
Gene-drug selection was based on published PGx criteria, of which the Clinical Pharmacogenetics Implementation Consortium (CPIC) (19) and the Dutch Pharmacogenetics Working Group (DPWG) (20) guidelines are the most widely recognized. Both the CPIC and DPWG guidelines provide clinical recommendations for patients with a known genotype. Minor differences exist between them (20), as methodologies and clinical practice vary among countries. However, there is so far no recommendation on which PGx tests should be prioritized for an individual beforehand. Recently, the DPWG developed the Clinical Implication Score (CIS) (20) with the goal of establishing for which drugs testing of specific genetic variants is warranted. The CIS is translated into a three-category recommendation for testing: Essential, Beneficial and Potentially Beneficial (Supplementary Table 4).
In this study, we focused only on the gene-drug pairs in the DPWG Essential category. We observed that every newborn in the Qingdao cohort carried at least one clinically relevant variant (Fig. 1) in the Essential PGx genes. Among the gene-drug pairs, CYP2D6 had the highest variant carrier rate (Table 4), with 266 of 321 infants carrying at least one relevant variant. In total, 150 infants carried one copy of *10 (rs1065852), while 81 infants carried two copies of *10, suggesting that at least 25% of infants have decreased CYP2D6 function in codeine metabolism (21, 22). CYP2C19 showed the second highest carrier rate, with 209 of 321 infants carrying at least one clinically relevant variant. Twenty-five newborns carried the homozygous CYP2C19 genotype *2/*2 and one carried *3/*3; these genotypes would lead to a lack of enzyme activity and reduced metabolism of clopidogrel via the CYP2C19 pathway (23). In addition, 133 and 122 infants carried variants in UGT1A1 and NUDT15, respectively. Homozygous variants in UGT1A1 (24, 25) result in reduced metabolism of irinotecan, and homozygous variants in NUDT15 (26, 27) in reduced metabolism of azathioprine, mercaptopurine and tioguanine. No clinically relevant variant was detected in DPYD (Table 4).
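The carrier rates quoted in this paragraph follow directly from the per-gene carrier counts. The minimal Python sketch below recomputes them as an illustration; the dictionary layout and the function name `carrier_rate` are assumptions made for this example, while the counts themselves are taken from the text and Table 4.

```python
COHORT_SIZE = 321

# Infants carrying at least one clinically relevant variant, per gene
# (counts as reported in the text and Table 4).
carriers_by_gene = {
    "CYP2D6": 266,
    "CYP2C19": 209,
    "UGT1A1": 133,
    "NUDT15": 122,
    "DPYD": 0,
}


def carrier_rate(n_carriers, cohort_size=COHORT_SIZE):
    """Fraction of the cohort carrying at least one relevant variant."""
    return n_carriers / cohort_size


rates = {gene: carrier_rate(n) for gene, n in carriers_by_gene.items()}

# Rank genes from the highest to the lowest carrier rate.
ranking = sorted(rates, key=rates.get, reverse=True)
for gene in ranking:
    print(f"{gene}: {rates[gene]:.1%}")

# Infants with two copies of CYP2D6 *10 (81 of 321).
star10_hom_fraction = 81 / COHORT_SIZE
print(f"CYP2D6 *10/*10: {star10_hom_fraction:.1%}")
```

The homozygous *10 fraction comes out slightly above one quarter, which is the basis for the "at least 25%" statement.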
Table 4
Overview of PGx gene-variant statistics in the 321 newborns and an allele frequency comparison by population
Drug | Gene | Allele variation | dbSNP RS ID | Carriers (Het) | Carriers (Hom) | Carriers (Total) | Total carriers related to the gene | Number of QD cohort | MAF* Qingdao | MAF* EAS | MAF* SAS | MAF* AFR | MAF* EUR/Caucasian | MAF* AMR |
Azathioprine, Mercaptopurine, Tioguanine | NUDT15 | *3 | rs116855232 | 74 | 5 | 79 | 122 | 321 | 13.08% | 9.52% | 6.95% | 0.08% | 0.20% | 4.47% |
| | *4 | rs147390019 | 1 | 0 | 1 | | | 0.16% | 0.10% | 0.00% | 0.00% | 0.00% | 0.72% |
| | *5 | rs186364861 | 4 | 0 | 4 | | | 0.62% | 1.39% | 0.10% | 0.00% | 0.00% | 0.00% |
| | *6 | rs554405994 | 36 | 2 | 38 | | | 6.23% | 4.76% | 0.20% | 0.15% | 0.30% | 3.89% |
Irinotecan | UGT1A1 | *6 | rs4148323 | 106 | 15 | 121 | 133 | 321 | 21.18% | 13.79% | 1.74% | 0.08% | 0.70% | 1.15% |
| | *27 | rs35350960 | 12 | 0 | 12 | | | 1.87% | 1.39% | 0.00% | 0.00% | 0.00% | 0.00% |
Clopidogrel | CYP2C19 | *2 | rs4244285 | 141 | 25 | 166 | 209 | 321 | 29.75% | 31.25% | 35.79% | 17.02% | 14.51% | 10.52% |
| | *3 | rs4986893 | 33 | 1 | 34 | | | 5.45% | 5.56% | 1.23% | 0.23% | 0.00% | 0.00% |
| | *4A/B | rs28399504 | 1 | 0 | 1 | | | 0.16% | 0.10% | 0.00% | 0.00% | 0.10% | 0.29% |
| | *5 | rs56337013 | 0 | 0 | 0 | | | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| | *6 | rs72552267 | 1 | 0 | 1 | | | 0.16% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| | *8 | rs41291556 | 0 | 0 | 0 | | | 0.00% | 0.00% | 0.10% | 0.08% | 0.30% | 0.00% |
| | *9 | rs17884712 | 0 | 0 | 0 | | | 0.00% | 0.00% | 0.00% | 0.98% | 0.00% | 0.14% |
| | *10 | rs6413438 | 0 | 0 | 0 | | | 0.00% | 0.00% | 0.00% | 0.15% | 0.00% | 0.14% |
| | *17 | rs12248560 | 7 | 0 | 7 | | | 1.09% | 1.49% | 13.60% | 23.52% | 22.37% | 11.96% |
Codeine | CYP2D6 | *3 | rs35742686 | 0 | 0 | 0 | 266 | 321 | 0.00% | 0.00% | 0.20% | 0.23% | 1.89% | 0.58% |
| | *4 | rs3892097 | 5 | 0 | 5 | | | 0.78% | 0.20% | 10.94% | 6.05% | 18.59% | 12.97% |
| | *6 | rs5030655 | 0 | 0 | 0 | | | 0.00% | 0.00% | 0.10% | 0.08% | 1.99% | 0.29% |
| | *8 | rs5030865 | 10 | 1 | 11 | | | 1.87% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% |
| | *9 | rs5030656 | 0 | 0 | 0 | | | 0.00% | 0.00% | 0.00% | 0.08% | 2.58% | 1.30% |
| | *10 | rs1065852 | 150 | 81 | 231 | | | 48.60% | 57.14% | 16.46% | 11.27% | 20.18% | 14.84% |
| | *14A/B | rs5030865 | 0 | 0 | 0 | | | 0.00% | 0.99% | 0.00% | 0.00% | 0.00% | 0.00% |
| | *17 | rs28371706 | 0 | 0 | 0 | | | 0.00% | 0.00% | 0.00% | 21.79% | 0.20% | 0.86% |
| | *41 | rs28371725 | 18 | 1 | 19 | | | 3.12% | 3.77% | 12.17% | 1.82% | 9.34% | 6.20% |
Fluorouracil, Capecitabine, Tegafur + DPD-inhibitor | DPYD | *2A | rs3918290 | 0 | 0 | 0 | 0 | 321 | 0.00% | 0.00% | 0.82% | 0.08% | 0.50% | 0.14% |
| | *13 | rs55886062 | 0 | 0 | 0 | | | 0.00% | 0.00% | 0.00% | 0.00% | 0.10% | 0.00% |
| | X | rs67376798 | 0 | 0 | 0 | | | 0.00% | 0.00% | 0.10% | 0.08% | 0.70% | 0.29% |
| | X | rs56038477 | 0 | 0 | 0 | | | 0.00% | 0.00% | 1.94% | 0.08% | 2.39% | 0.58% |
Notes: The gene-drug pairs refer to the DPWG “Essential” category. *MAF data refer to the 1000 Genomes phase 3 dataset. NA indicates that no data are available from the 1000 Genomes dataset and that the variant thus cannot be detected by the current pipeline.
We further investigated the differences in allele frequency between the Qingdao cohort dataset and five populations of the 1000 Genomes dataset: East Asians (EAS), South Asians (SAS), Africans (AFR), Europeans (EUR), and Admixed Americans (AMR). In most cases, the allele frequencies of the Qingdao cohort are consistent with the EAS dataset, while the other four populations differ significantly (Table 4).
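As an illustration of how the cohort allele frequencies relate to the carrier counts, the Python sketch below derives the frequency from the Het and Hom columns of Table 4 as (Het + 2 × Hom) / (2 × 321) and compares it with the EAS value. That the Qingdao column was obtained in exactly this way is an assumption, although the formula reproduces the reported values for the four rows used here. The dictionary layout and the function name `allele_frequency_pct` are hypothetical.

```python
COHORT_SIZE = 321

# (gene, allele) -> (Het carriers, Hom carriers, EAS frequency in %),
# copied from Table 4.
table4_rows = {
    ("NUDT15", "*3"): (74, 5, 9.52),
    ("UGT1A1", "*6"): (106, 15, 13.79),
    ("CYP2C19", "*2"): (141, 25, 31.25),
    ("CYP2D6", "*10"): (150, 81, 57.14),
}


def allele_frequency_pct(het, hom, cohort_size=COHORT_SIZE):
    """Allele frequency in percent: each heterozygote contributes one
    variant allele, each homozygote two, out of 2 * cohort_size alleles."""
    return 100.0 * (het + 2 * hom) / (2 * cohort_size)


qingdao_pct = {
    key: round(allele_frequency_pct(het, hom), 2)
    for key, (het, hom, _eas) in table4_rows.items()
}

for (gene, allele), (_het, _hom, eas) in table4_rows.items():
    qd = qingdao_pct[(gene, allele)]
    print(f"{gene} {allele}: Qingdao {qd:.2f}% vs EAS {eas:.2f}% "
          f"(difference {qd - eas:+.2f} points)")
```

A formal comparison between populations would additionally require the underlying allele counts of the reference dataset, which are not listed in Table 4.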