Inherited kidney diseases, including ARPKD, are leading causes of CKD and ESKD in children in Oman and result in significant morbidity and mortality. Previous studies from Oman have provided ARPKD-associated morbidity data but lacked molecular genetic data [10, 14]. In this study, we provide a clinical and molecular genetic analysis of PKHD1 in a cohort of 40 patients, obtaining a molecular genetic diagnosis in 38 patients with clinically suspected ARPKD.
Most study patients had early-onset ARPKD, as reflected by age at initial diagnosis. Five were diagnosed prenatally, 24 during their first year of life and 11 during childhood. These early-onset phenotypes are in agreement with those reported in other studies. Clinical analysis showed that the morbidities frequently associated with ARPKD were also common in our patients, including systemic hypertension, congenital hepatic fibrosis, splenomegaly, pulmonary hypoplasia and CKD. Fifteen percent of the studied patients died during the perinatal or neonatal period due to respiratory insufficiency, which is similar to the death rate during the first year of life reported by Bergmann. It is estimated that 30-50% of ARPKD patients die shortly after birth due to respiratory failure, whereas kidney failure is a rare cause of neonatal death. With advances in renal replacement therapy modalities, the survival of neonates and children with ARPKD has improved. In our cohort, 12 patients (30%) developed ESKD by a median age of 13 years and hence required either renal replacement therapy (n=8) or kidney transplantation (n=4) (Table 2).
Because PKHD1 is a large gene, we had previously applied a targeted NGS gene panel to identify the causative mutations in 18 unrelated patients (Table 2). Four missense mutations in PKHD1, located within exons 3, 6, 32 and 58, were identified as the genetic causes of ARPKD in this cohort. We therefore proceeded with a targeted exon PCR approach, using Sanger sequencing of these exons alone, for the molecular diagnosis of the other ARPKD patients (n=20) from 14 different families. In total, 30 out of 32 suspected ARPKD families were solved with biallelic changes in PKHD1, a diagnostic rate of 94%. Targeted PCR analysis of these specific alleles is therefore a convenient and cost-effective diagnostic tool. Failure to detect mutations in two unrelated probands using the Sanger screening approach alone may be explained by the heterogeneity of ARPKD, where mutations may lie in other PKHD1 exons or indeed in other recessive cystogenes that phenocopy ARPKD. None of the unsolved patients had a family history of kidney disease. Mutations are therefore possible in autosomal recessive genes that lead to ARPKD-like phenotypes, such as DZIP1L, or even in dominant cystic kidney disease genes such as HNF1B, PKD1 and PKD2, in which mutations often occur de novo. Whole exome sequencing, which would allow inclusion of genes such as DZIP1L, is the suggested option for these unsolved patients. With the current improvements in high-throughput sequencing of different renal ciliopathy genes, it is anticipated that the majority of patients with cystic kidney disease phenotypes can receive a precise molecular genetic diagnosis.
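The tiered testing strategy described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the laboratory pipeline used in the study: the `Variant` record, `triage_family` and `diagnostic_yield` are hypothetical names introduced here, and the sketch assumes that variant pathogenicity and phase (in trans) have already been established by other means. Only the four target exons and the 30 of 32 family counts come from the text.

```python
from dataclasses import dataclass

# Exons in which the four Omani PKHD1 missense variants were found (from the text).
TARGET_EXONS = frozenset({3, 6, 32, 58})


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one detected pathogenic PKHD1 allele."""
    exon: int
    hgvs_protein: str


def triage_family(variants):
    """Classify one family after Sanger sequencing of TARGET_EXONS only.

    `variants` lists the pathogenic alleles seen in the proband, one entry
    per allele (a homozygous change appears twice). Pathogenicity and phase
    are assumed to have been confirmed separately.
    """
    in_target = [v for v in variants if v.exon in TARGET_EXONS]
    if len(in_target) >= 2:
        return "solved"      # biallelic PKHD1 changes found
    return "escalate"        # e.g. broader gene panel or whole exome sequencing


def diagnostic_yield(solved, total):
    """Fraction of families with a biallelic PKHD1 diagnosis."""
    if total <= 0:
        raise ValueError("total must be positive")
    return solved / total


if __name__ == "__main__":
    t36m = Variant(exon=3, hgvs_protein="p.(Thr36Met)")
    print(triage_family([t36m, t36m]))          # homozygous proband
    print(triage_family([t36m]))                # single heterozygous hit
    print(f"{diagnostic_yield(30, 32):.0%}")    # family counts from the text
```

A family with fewer than two alleles in the target exons is not declared negative; it is passed on to broader testing, which mirrors the suggestion of whole exome sequencing for the two unsolved probands.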
To date, 748 unique PKHD1 variants have been recorded in the Human ARPKD/PKHD1 Mutation Database (http://www.humgen.rwth-aachen.de/index.php). Approximately 45% of these variants are missense alterations resulting in substitution of conserved amino acids, which usually leads to partial or complete dysfunction of fibrocystin. All of the PKHD1 variants in this study are missense alterations of highly conserved amino acids. Three of these variants are reported in the ARPKD/PKHD1 Mutation Database, and p.(Thr36Met) is the most frequently found PKHD1 mutation in ARPKD patients to date. Structurally, fibrocystin is an integral membrane protein consisting of a large amino-terminal extracellular domain (about 3,860 aa) containing various glycosylation sites, a single transmembrane segment and a short cytoplasmic C-terminal tail (about 195 aa) comprising four potential protein kinase A phosphorylation sites (Supplementary Figure S2B). The localization of fibrocystin to primary cilia and its integral membrane structure suggest a sensory role in which fibrocystin acts as a receptor, transducing extracellular information into the cell through stimulation of signal cascades and thus controlling cell-cell adhesion and proliferation. The four missense variants identified in this study are located in exons encoding the extracellular domain (Supplementary Figure S2C) and are either very rare or not observed in reference databases of healthy controls (including ExAC, gnomAD and the 1000 Genomes Project; Table 3).
In the Omani population, our previous NGS studies did not find the PKHD1 truncating mutations that have been described in patients with perinatal lethal phenotypes [15, 18]. The most frequent change identified in Omani families was p.(Thr36Met), located in exon 3, which was detected homozygously in 16 families and heterozygously in 8 families, accounting for almost 75% of the families. Patients with the homozygous p.(Thr36Met) change (n=18) had an earlier age of onset and increased disease severity (6 had a severe perinatal presentation and died before one year of age, while the remainder had either an infantile or early childhood presentation leading to ESKD) (Table 2). Consistent with observations made by Bergmann et al., the p.(Thr36Met) variant may lead to variability in the age of onset and severity of disease. Additionally, p.(Thr36Met) in combination with the missense changes p.(Thr136Ala) and p.(Arg1624Trp) caused a relatively severe form of ARPKD in some of our cases, which is in agreement with previously reported studies [19-21].
The relatively common pathogenic PKHD1 allele p.(Thr36Met) has been described in many populations and ethnicities. Whether this allele is a highly conserved ancestral change that is frequent in some populations, such as the Central European population [19, 22], or is caused by recurrent mutational events is uncertain. The p.(Thr36Met) allele appears to be common in European genomes, with an expected carrier frequency of 1:412. However, detection of this change in patients of different ethnicities and origins suggests that p.(Thr36Met) is a PKHD1 ‘hotspot’ mutation caused by the frequent deamination of methylated cytosine to thymine at CpG sites [19, 24]. It has also been proposed that the substitution of threonine by methionine creates a potential alternative translation start codon that may be even stronger than the original start codon. The protein product initiating from position c.107 would be predicted to lead to complete loss of protein function due to improper protein folding.
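As a rough illustration of what a carrier frequency of 1:412 implies, the following sketch converts it to an allele frequency and to an expected frequency of p.(Thr36Met) homozygotes. It assumes Hardy-Weinberg equilibrium (random mating), which is an assumption of this example and not a claim made in the study; it would not hold in populations with a high rate of consanguineous marriage, where homozygotes are more frequent than this calculation suggests. The function name is hypothetical.

```python
import math


def allele_freq_from_carrier_freq(carrier_freq):
    """Solve 2q(1 - q) = carrier_freq for the smaller root q.

    Assumes Hardy-Weinberg equilibrium; carrier_freq is the fraction of
    heterozygous carriers in the population.
    """
    if not 0 <= carrier_freq <= 0.5:
        raise ValueError("carrier_freq must lie in [0, 0.5]")
    return (1 - math.sqrt(1 - 2 * carrier_freq)) / 2


if __name__ == "__main__":
    carrier = 1 / 412                      # carrier frequency quoted in the text
    q = allele_freq_from_carrier_freq(carrier)
    homozygote = q ** 2                    # expected under random mating only
    print(f"allele frequency q is about 1 in {1 / q:.0f}")
    print(f"expected homozygotes: about 1 in {1 / homozygote:.0f}")
```

For a rare allele, q is close to half the carrier frequency, so the exact root and the shortcut q = carrier/2 give nearly the same answer.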
The missense change p.(Arg1624Trp) was previously reported in patients of different ethnicities, including Caucasian Americans [24, 25], Dutch, Czechs, Slovenians, Saudi Arabians [24, 28, 29] and Kuwaitis. The p.(Arg1624Trp) mutation has been described in late-onset or older ARPKD presentations, both when present homozygously [24, 28] and when present heterozygously in trans with another truncating or missense change [24, 27]. In contrast, 6 of our patients with the p.(Arg1624Trp) mutation developed clinical features of ARPKD in infancy, 5 presented during childhood and 1 at 11 years of age. The p.(His3124Tyr) change combined with p.(Arg1624Trp) was found in a 26-year-old patient with stage 4 CKD, who was initially diagnosed with polycystic kidney disease in early childhood (Table 2). These findings contrast with those of Bergmann et al. and Gunay-Aygun et al., who correlated p.(His3124Tyr) with a severe perinatal-fatal phenotype.
These results therefore demonstrate that establishing genotype-phenotype correlations in ARPKD is challenging. Any correlation is complicated by the large number of missense variants distributed over the entire length of the coding exons of PKHD1 and by its complex splicing pattern. It was previously believed that two truncating mutations are associated with severe perinatal lethality and that the presence of at least one missense mutation is required for survival beyond the neonatal period. However, evidence is accumulating that some missense mutations are highly pathogenic and may cause complete loss-of-function effects. The wide variability in ARPKD severity among patients may in part be explained by differences in PKHD1 mutations and by the influence of modifier genes and environmental factors.
ARPKD is generally a severe form of pediatric ciliopathy with recognized phenotypic variability. While a significant number of ARPKD patients who survive the neonatal period reach adulthood, some patients first present in adulthood, and their kidney function ranges from normal to moderate kidney insufficiency to ESKD. Although bilateral kidney enlargement with multiple cysts is the major clinical characteristic, liver manifestations may lead to symptomatic disease complications in ARPKD patients. Liver disease tends to manifest later than kidney disease, typically with progressive hepatic fibrosis and portal hypertension. Hypersplenism, portal hypertension and variceal bleeding are major liver complications that may develop as a result of progressive liver fibrosis. In rare cases, both kidney and liver disease may present in late adolescence or in adulthood. The low prevalence, limited clinical information and atypical sonographic pattern of adult ARPKD can make clinical diagnosis and management challenging, so genetic testing may be required to establish a definite diagnosis. In this study, the absence of late-presenting ARPKD in our cohort may be due to recruiting predominantly cystic kidney disease phenotypes.
The Omani population is characterized by a unique structure of tribal communities occupying defined geographical regions. This structure has been conserved over many generations and has created genetic isolates. The customs of consanguineous marriage and within-tribe (endogamous) marriage are strongly preserved in Oman, accounting for 56.3% and 20.4% of total marriages, respectively. Over 300 genetic diseases have been identified in the Omani population. The high frequency of recessive disorders in this population is probably related to a combination of genetic drift, consanguinity and geographical isolation. The detection of only 4 pathogenic variants across different geographical regions of the country may be explained by the presence of PKHD1 founder alleles and reveals a high degree of homogeneity in this population. Similarly, genetic studies of population isolates such as the Finnish, French, Ashkenazi Jewish and African populations represent a powerful method of finding founder mutations in PKHD1, which can be utilized for efficient diagnostic testing of at-risk individuals and pregnancies in these populations (Supplementary Table S3) [19, 39-41].
Currently, there is no cure for ARPKD, and treatment is limited to managing the clinical complications. Together, translational research and clinical trials in patients may facilitate successful drug development in the near future. Given the absence of clinical biomarkers and the lack of a comprehensive assessment of the available therapeutic options for ARPKD patients on the one hand, and the great morbidity and mortality of the disease on the other, there is a serious need for prospective and retrospective population studies and for the construction of an international clinical database. Such efforts can extend the current understanding of ARPKD and deliver more information on extrarenal manifestations and treatment options. Recently, the German Society for Pediatric Nephrology (GPN) and the European Study Consortium for Chronic Kidney Disorders Affecting Pediatric Patients (ESCAPE) collaborated to initiate an international multicenter registry of ARPKD (ARegPKD). The continued identification of PKHD1 variants and their associated phenotypes should be promoted, and the inclusion of cohorts from different ethnicities is valuable and should be encouraged.