Integrating population variation and protein structural analysis to improve the clinical genetic diagnosis and treatment in children with congenital nephrogenic diabetes insipidus

Background and Objectives: Congenital nephrogenic diabetes insipidus (NDI) is a rare genetic disorder characterized by the inability of the kidney to concentrate urine. Establishing the genetic diagnosis appears particularly important in NDI for early detection and differential diagnosis. Method: We used a Chinese multicenter registry to investigate genotype and phenotype in children with NDI from 2014 to 2019. The structural locations of the pathogenic mutations from this study and the literature, as well as of population variants retrieved from gnomAD, were analyzed. Results: A total of 10 boys from 9 families carried mutations in AVPR2 (8/10) or AQP2 (2/10). Another 7 relatives of the families were diagnosed by sequencing as having partial or subclinical NDI. Patients presented with dehydration, polyuria-polydipsia, and severe hypernatremia, with a median age at diagnosis of 1.0 month (IQR 0.16, 18). Protein structural analysis revealed a notable clustering of diagnostic mutations in the transmembrane regions of AVPR2, and an enrichment of diagnostic mutations with autosomal dominant (AD) inheritance in the C-terminal region of AQP2. The pathogenic mutations were significantly more likely than the population variants to be buried inside the domain. Based on structural analysis and in silico prediction, the eight mutations identified in this study were considered presumably disease-causing. The most common treatments were thiazide diuretics and non-steroidal anti-inflammatory drugs (NSAIDs). Isotonic saline should not be chosen as the rehydration fluid for emergency treatment of hypernatremic dehydration in neonates. Conclusion: Genetic analysis presumably confirmed the diagnosis of NDI in every patient of the studied cohort. We plead for early identification of NDI, confirmed by phenotype and genotype, so that treatment can be optimized.


Introduction
The congenital form of nephrogenic diabetes insipidus (NDI) is a rare inherited disorder characterized by insensitivity of the distal nephron to the antidiuretic action of arginine-vasopressin (AVP) and a reduced ability of the kidney to concentrate urine, leading to severe dehydration and electrolyte imbalance (hypernatremia and hyperchloremia) 1. Inheritance is X-linked in 90% of patients, due to mutations in the gene coding for the vasopressin type 2 receptor (AVPR2) (OMIM #304800) 1,2. The remaining patients have autosomal recessive or dominant forms due to mutations in the gene coding for the water channel aquaporin 2 (AQP2) (OMIM #222000; OMIM #125800) 1,3. The main clinical hallmarks of NDI are polyuria and compensatory polydipsia. With an inadequate water supply, a hot environment, or episodic losses of free water, patients with NDI cannot properly compensate for water loss and are at risk of severe dehydration. The urine concentrating defect is present at birth, and symptoms arise during the first week of life as irritability, poor feeding, and failure to thrive 4. Persistent polyuria can lead to the development of kidney megacystis, hydroureter, and hydronephrosis. Repeated episodes of dehydration can cause mental retardation 5, a serious complication of NDI that is probably secondary to hypoxic episodes 6.
Establishing the genetic diagnosis appears particularly important in NDI, both for early detection and for differential diagnosis, in view of its unique associated features and long-term complications.
In the current study, we used a multicenter strategy to investigate genotype and phenotype in a cohort of Chinese children clinically diagnosed with NDI. Describing the clinical and genetic spectrum of Chinese children with NDI will help to devise a gene analysis strategy appropriate for our population.

Materials And Methods
Study design and participants

Genetic analysis
To further identify and confirm the diagnosis, we performed trio whole-exome sequencing (Trio-WES) of the proband and both parents concurrently, after informed consent was obtained from the parents. Samples were subjected to whole-exome sequencing (WES) by an external provider, and raw data were transferred to our lab for bioinformatic analysis. Raw FASTQ data were analyzed with FastQC software to check the quality in terms of length and GC content of reads, quality of nucleotides within the reads,
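The kind of per-read quality summary described here (read length, GC content, base quality) can be illustrated with a minimal sketch. This is not FastQC and not the pipeline used in the study; the function names, the Phred+33 quality encoding and the two toy reads are assumptions made only for illustration.

```python
from io import StringIO
from statistics import mean


def fastq_records(handle):
    """Yield (read_id, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:
            return
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line
        quality = handle.readline().strip()
        yield header[1:], sequence, quality


def summarize(handle, phred_offset=33):
    """Return read count, mean read length, mean GC fraction and mean Phred score."""
    lengths, gc_fractions, mean_quals = [], [], []
    for _, sequence, quality in fastq_records(handle):
        if not sequence:
            continue  # skip empty reads to avoid division by zero
        lengths.append(len(sequence))
        gc_fractions.append(sum(base in "GCgc" for base in sequence) / len(sequence))
        mean_quals.append(mean(ord(ch) - phred_offset for ch in quality))
    return {
        "n_reads": len(lengths),
        "mean_length": mean(lengths),
        "mean_gc": mean(gc_fractions),
        "mean_phred": mean(mean_quals),
    }


# Two toy reads, invented for illustration (not study data).
toy_fastq = "@r1\nGGCC\n+\nIIII\n@r2\nATAT\n+\n!!!!\n"
print(summarize(StringIO(toy_fastq)))
```

In practice a dedicated tool reports many more metrics (per-position quality, adapter content, duplication); the sketch only shows what "length, GC content and nucleotide quality" mean at the level of a single FASTQ file.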

Patient Characteristics
A total of ten boys from nine families with registry information in the CCGKDD database from 2014 to 2019 were enrolled in this study. Patient characteristics are shown in Table 1. The median age at clinical diagnosis was 1.0 month (IQR 0.16, 18). Three cases had a delayed diagnosis, being clinically diagnosed after 2 years of age. In one of them, the AQP2 mutation was not identified until he had progressed to chronic kidney disease (CKD) stage 3 at 14 years of age. Polyuria-polydipsia and hypernatremia were present in all ten probands. Five patients were diagnosed during the neonatal period, and four of them had fever, vomiting and anorexia at initial presentation. Although five families had multiple affected members, only one family sought early genetic screening for a newborn baby because of his affected sibling with NDI. Five patients had a water restriction test before genetic testing. Eight patients underwent cranial MRI as part of their diagnostic evaluation, without any abnormalities being found. Hydronephrosis was found in four children. No patient had hypercalciuria or abnormal renal function on serum chemistries at initial presentation.

Genetic analysis
We detected six different mutations of AVPR2 and two mutations of AQP2 in the study cohort. Four were previously described mutations (AVPR2: p.L62P 10, p.A165D 11, and p.S167L 12; AQP2: p.R187C 13), and four were novel (AVPR2: p.F77del, p.S331Rfs*25 and a small deletion of exon 2; AQP2: p.R267fs*66). In total, four were missense mutations, three were small deletions resulting in frameshift mutations, and the last was a gross deletion of an exon (Table 1 and Fig. 1).
Pathogenicity predicted with in silico algorithms is shown in Table 2. We also detected seven female relatives with heterozygous mutations and two maternal uncles with hemizygous mutations in the AVPR2 gene. Among them, three suffered from polydipsia and polyuria, and another four female individuals presented with partial or subclinical NDI.

Variants analysis and comparative protein modeling
The AVPR2 protein is a typical G protein-coupled receptor (GPCR) with seven membrane-spanning helices (Fig. 1A, C). The residue depth of the disease-causing mutations is significantly higher than that of the population SNPs and that of all variants from gnomAD (Wilcoxon-Mann-Whitney test, P = 2e-8 and P = 2e-5, respectively; Fig. 2A, B).
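The residue depth comparison relies on a Wilcoxon-Mann-Whitney test between two groups of per-residue values. A minimal sketch of such a test is given below, assuming the normal approximation with tie correction and no continuity correction; the software and exact variant of the test used in the study are not stated, and the depth values are invented for illustration only.

```python
import math
from collections import Counter


def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided p-value) by normal approximation."""
    pooled = list(x) + list(y)
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction reduces the variance when values are repeated.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical residue depths in angstroms (not data from this study).
pathogenic_depth = [6.1, 7.4, 8.0, 9.2, 10.5]
population_depth = [3.2, 3.9, 4.4, 5.0, 6.5]
u_stat, p_value = mann_whitney_u(pathogenic_depth, population_depth)
print(f"U = {u_stat}, two-sided P = {p_value:.4f}")
```

For samples as small as the toy example an exact test would be preferable; with the group sizes reported in the figure legends the normal approximation is the usual choice.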
Structural analysis of AVPR2 revealed the presence of hemizygous mutations in eight patients (Fig. 1A). The three missense mutations of AVPR2 (p.L62P, p.A165D and p.S167L) affect residues located within the transmembrane domain and produce full-length misfolded proteins, which are mostly retained in the endoplasmic reticulum (ER) by the ER quality-control machinery and targeted for proteasomal degradation. The deletion mutation p.F77del removes a residue in the first intracellular loop (a well-conserved region) and probably results in a defective protein. The small deletion p.S331Rfs*25 causes a frameshift with a premature stop codon. The resulting loss of the C-terminal tail of AVPR2 abolishes the binding of β-arrestin to the phosphorylated tail of AVPR2 and the subsequent receptor internalization that allows the engagement of G protein with the core of AVPR2. Formation of the AVPR2-G protein-β-arrestin megaplex is required for the active signaling that leads to sustained endosomal cAMP generation 15. The gross deletion of exon 2 leads to truncated proteins, which are often rapidly degraded.
The AQP2 gene is located on chromosome 12q13 and codes for the 271-amino-acid AQP2 protein, a type IV-A transmembrane protein characterized by six transmembrane domains connected by five loops and intracellular N- and C-termini 16. A total of 53 missense mutations and 9 small deletions in AQP2 have been reported, and a further 70 population missense variants were retrieved from the gnomAD database.
Disease-causing and population variants were scattered throughout the different structural domains of the protein, with no significant enrichment of disease-causing variants in the transmembrane regions (Fisher test, P = 0.079, Fig. 1B). However, five of the nine small deletions in AQP2 affected residues 223 to 271 in the C-terminal cytoplasmic region. There was a significant enrichment of disease-causing variants with autosomal dominant (AD) inheritance in the C-terminal region (Pearson test, P = 0.0001, Fig. 1D). The residue depth of the disease-causing mutations is significantly higher than that of the population SNPs and that of all variants from gnomAD (Wilcoxon-Mann-Whitney test, P = 2e-7 and P = 2e-5, respectively; Fig. 2C, D).
Structural analysis of AQP2 revealed the presence of mutations in two patients (Fig. 1B). The homozygous mutation p.R187C affects an amino acid in the selectivity filter region of the water conduction pore, which determines transport specificity 17. ER accumulation of these AQP2 mutants has been shown in several studies 4
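The enrichment test for transmembrane location compares two groups of variants across two location classes, which is a 2x2 contingency table. The sketch below shows one common definition of a two-sided Fisher exact test (summing all tables no more probable than the observed one); the counts in the example are hypothetical, since the per-region counts are not given in the text, and the function name is ours.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: variant class (e.g. disease-causing, population).
    Columns: location (e.g. transmembrane, other).
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_probability(k):
        # Hypergeometric probability of k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table at least as extreme (no more probable) as the observed one.
    return sum(
        table_probability(k)
        for k in range(lowest, highest + 1)
        if table_probability(k) <= observed * (1 + 1e-7)
    )


# Hypothetical counts (not study data):
# 8 of 10 disease-causing variants vs 1 of 6 population variants in transmembrane regions.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"two-sided P = {p_value:.4f}")
```

The small tolerance factor guards against floating-point rounding when two tables have equal probability.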

Treatment Regimens
All the babies diagnosed with NDI during the neonatal period in this study needed emergency treatment for hypernatremic dehydration. Most of them received isotonic or hypotonic rehydration before the clinical diagnosis of NDI. This is illustrated by case ID#3, a neonate whose plasma Na increased from 153 to 161 mmol/l after rehydration with intravenous 0.19% saline. A tonicity balance readily demonstrates the excess NaCl administered with 0.19% saline (Fig. 3A, B). In this case, the appropriate fluid, 5% glucose, which replaces the free water lost, was prescribed at a maintenance rate appropriate to his age and size, pending adjustment according to the plasma Na. Breastfeeding was prioritized for its hypotonic advantage over formula feed.
Most of our patients started conventional treatment with thiazides after the clinical diagnosis of NDI, followed by non-steroidal anti-inflammatory drugs (NSAIDs). Thiazides were prescribed to all ten patients, and a thiazide plus an NSAID to four (4/9) patients. During follow-up, NSAIDs were discontinued because of general concerns about long-term use (3/4) or increased serum creatinine (1/4). Median serum sodium at initial treatment was 154 mmol/L (IQR 145, 161). At last follow-up, median age was 5.0 years (IQR 2.6, 8.3) and median serum sodium was 144 mmol/L (IQR 140, 145) (Fig. 3C). At the time of last follow-up, only one patient (#2), who carried an AQP2 mutation and whose father and maternal grandfather had died with renal failure, had progressed to CKD stage 3.
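The tonicity balance for case ID#3 can be reproduced as a short calculation. The sketch below uses the values given in the legend of Figure 3 (urine Na 12 mmol/l, 1 L of urine replaced by 1 L of 0.19% saline, estimated total body water 2.8 L); the molar mass of NaCl (58.44 g/mol) and the function names are our own additions for illustration.

```python
NACL_MOLAR_MASS_G_PER_MOL = 58.44


def saline_na_mmol_per_l(percent_w_v):
    """Na concentration (mmol/l) of a NaCl solution given in % w/v (g per 100 mL)."""
    grams_per_litre = percent_w_v * 10
    return grams_per_litre / NACL_MOLAR_MASS_G_PER_MOL * 1000


def plasma_na_rise(infused_na, urine_na, volume_l, total_body_water_l):
    """Estimated rise in plasma Na (mmol/l) when urine is replaced volume for volume."""
    net_na_gain_mmol = (infused_na - urine_na) * volume_l
    return net_na_gain_mmol / total_body_water_l


infused = saline_na_mmol_per_l(0.19)          # about 32.5 mmol/l
rise = plasma_na_rise(infused, 12, 1.0, 2.8)  # about 7.3 mmol/l
print(f"0.19% saline: {infused:.1f} mmol/l Na; estimated plasma Na rise: {rise:.1f} mmol/l")
print(f"0.9% saline: {saline_na_mmol_per_l(0.9):.0f} mmol/l Na")
```

Water balance is unchanged in this scenario (1 L in, 1 L out), so the whole effect comes from the net sodium gain of about 20.5 mmol distributed over the total body water.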

Discussion
This study reports the genetic spectrum and treatment approaches in a cohort of children with NDI registered in a Chinese multicenter database. We report eight cases from seven families presenting with X-linked NDI (AVPR2), one with autosomal recessive NDI (AQP2) and one with autosomal dominant NDI (AQP2).
Although the pathophysiology and molecular diagnosis of congenital polyuric states have been well established 19, we still encounter cases in which the diagnosis is late and inappropriate diagnostic tests and treatments are performed. Optimizing the diagnostic strategy, including genetic analysis, is one of our top priorities. AVPR2 and AQP2 are the known disease-causing genes for NDI, and it is well established that heterozygous loss-of-function variants of AVPR2 or AQP2 result in congenital NDI 2,3,20. To help elucidate the pathogenicity of variants of AVPR2 or AQP2, we compared population SNPs with pathogenic mutations from HGMD. Although in principle the existence of a single population variant does not rule out pathogenicity, it is unlikely that the observed population variants of AVPR2 or AQP2 are pathogenic, since individuals with severe early-onset childhood disorders have specifically been excluded from gnomAD. We evaluated the variants in the 3D domain structures encoded by AVPR2 and AQP2 to look for positional correlations with pathogenicity. There is notable clustering in the AVPR2 3D structure, with pathogenic mutations more likely to lie within the transmembrane regions. Such clustering in the transmembrane regions was not seen for AQP2. An enrichment of diagnostic mutations with autosomal dominant (AD) inheritance was found in the C-terminal region of AQP2. Nonetheless, the pathogenic missense mutations of both AVPR2 and AQP2 were significantly more likely to be buried within the domain. Systematic analysis of the protein structure and variants allowed us to make strong predictions about likely pathogenic variants in both AVPR2 and AQP2. One limitation of our study is the lack of functional studies, especially for the novel variants detected in the NDI cohort. The mutations identified in our study were therefore considered presumably disease-causing on the basis of protein structural analysis and in silico prediction.
As next-generation sequencing is increasingly applied in both research and clinical settings, more and more variants will be discovered in known disease genes as well as in novel disease genes. Although in silico predictions should not be relied on as the sole basis for determining the clinical significance of variants in proteins, we hope that the analysis used in this study provides useful structural evidence for variant interpretation. Moreover, combining clinical and population genetics with protein structural analysis offers a widely applicable in silico method for improving the clinical interpretation of novel missense variation.
Massive free water losses cause significant morbidity, even in treated patients 6. High fluid intake is necessary to avoid hypernatremic dehydration, which can result in permanent neurologic complications.
Most emergency protocols suggest initial treatment with 0.9% saline 21. The situation is different in NDI, however, because of the ongoing losses of essentially pure water in the urine. Infusion of 0.9% saline will result in excess sodium chloride administration and thus worsen the hypernatremia, as observed in our case treated with 0.19% saline. Children with NDI should therefore be treated with hypotonic fluids, either enterally (water/milk) or, if need be, intravenously (5% dextrose in water) 6,22. Of course, hypotonic fluids must never be administered as an intravenous bolus. The infusion rate should only slightly exceed the urine output, so that the plasma sodium concentration is normalized safely at a rate of less than 0.5 mmol/l/h (10-12 mmol/l/day). The main risk of a rapid decrease in plasma sodium is cerebral edema and potentially death. Yet concern about this complication, together with a misunderstanding of NDI, can lead to administration of excess salt, with a risk of a rapid increase in plasma Na leading to osmotic demyelination 6. The diagnosis of congenital NDI can be either missed or misunderstood, leading to dangerous mistreatment. It is therefore important to identify NDI early by phenotype and genotype, and consequently to optimize the treatment.
Hypernatremia in children with NDI induces strong thirst and, when asked, the parents of patients with NDI often provide the typical history of an extremely thirsty child who drinks large amounts of water so avidly that he or she often vomits afterward. Older babies will typically prefer to drink water and avoid food. Patients are also at risk of nocturnal enuresis, hydronephrosis, and poor growth, presumably because the need for high water consumption interferes with adequate caloric intake. In our cohort, five neonates were diagnosed early with symptoms including fever, vomiting and anorexia. Although five families had multiple affected members, only one family sought early genetic screening for a newborn baby because of his affected sibling with NDI. Besides the index cases, we identified three more males with NDI by sequencing of AVPR2 and AQP2, and another four female individuals who presented with partial or subclinical NDI. Confirming the clinical diagnosis with genetic screening allows the early diagnosis and management of at-risk members of families with identified mutations 6,19,23.
Our study had several limitations. The data were collected retrospectively from the registry system, and we were not able to obtain complete data on all of the patients. We described the complications and treatment approaches during a median follow-up period of 5 years. Urological complications such as hydronephrosis and urinary incontinence were noted (4/10). Unfortunately, no further details on nutrition, growth and mental development were available in our registry. The long-term morbidities reported for NDI include primary nocturnal enuresis (44%), persistent small stature (38%), urologic complications (37%), persistent failure to thrive (29%), and CKD stage 2 or greater (30%) 23. Here we report only one case with delayed diagnosis who progressed to CKD stage 3.
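The rate limit quoted above translates directly into a minimum correction time. The following sketch only illustrates that arithmetic; the function name and the target value of 145 mmol/l are assumptions for the example, while the starting value of 161 mmol/l is the peak plasma Na reported for case ID#3.

```python
def min_correction_hours(current_na, target_na, max_rate_mmol_l_per_h=0.5):
    """Minimum time (hours) to lower plasma Na without exceeding the given rate."""
    decrease_needed = current_na - target_na
    if decrease_needed <= 0:
        return 0.0
    return decrease_needed / max_rate_mmol_l_per_h


def observed_rate(na_start, na_end, hours):
    """Observed fall in plasma Na (mmol/l per hour) between two measurements."""
    return (na_start - na_end) / hours


# Lowering plasma Na from 161 to an assumed target of 145 mmol/l at 0.5 mmol/l/h.
print(min_correction_hours(161, 145))   # 32.0 hours
# A fall from 161 to 155 mmol/l over 6 hours would be 1.0 mmol/l/h, above the limit.
print(observed_rate(161, 155, 6) <= 0.5)
```

A correction of 16 mmol/l therefore takes at least 32 hours at the stated limit, which is why serial plasma sodium measurements are used to adjust the infusion rate.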

Conclusion
Newborns and young children with polyuria should be referred immediately to specialized centers with experience in treating hypernatremic dehydration and the ability to rapidly obtain a clinical diagnosis and genetic study.

Consent for publication
The parents of the children and adult patients described in this article provided consent for participation in the study and for publishing the obtained results.

bio-information evaluation and protein structural analysis; all the authors contributed to the clinical information and registry database; JR, HX, DM critically revised the manuscript

Figure 1
Schematic representation and protein structure of AVPR2 and AQP2 with the location of mutations. A. AVPR2, which belongs to the G protein-coupled receptor (GPCR) superfamily, is formed by 371 amino acids with seven transmembrane, four extracellular and four cytoplasmic domains, and is present at the basolateral membrane (UniProtKB P30518-1, P30518-2) 14. The seven transmembrane domains with the extracellular NH2 terminus and the intracellular COOH terminus are illustrated as originally reported. The ribbon diagram of AVPR2 from PDB entry 6U1N is shown with the GPCR-arrestin structure in a lipid bilayer. The six mutations identified in our NDI cohort are indicated by red dotted boxes, and the diagnostic mutations from HGMD are labeled with orange diamonds. B. AQP2, which belongs to the larger family of major intrinsic proteins, is formed by 271 amino acids. Aquaporins contain two tandem repeats, each containing three transmembrane α-helical domains, with the amino and carboxyl termini located on the cytoplasmic surface of the membrane (UniProtKB P41181) 16. The six transmembrane domains with the intracellular NH2 and COOH termini are illustrated as originally reported. The ribbon diagram of AQP2 from PDB entry 4NEF is shown in side and top views. The two mutations identified in our NDI cohort are indicated by red dotted boxes, and the diagnostic mutations from HGMD are labeled with orange diamonds. C. A violin plot showing the domain composition of AVPR2 with the location of the 109 population variants (SNPs) from gnomAD (turquoise squares) and the 169 diagnostic mutations from HGMD (peachpuff circles). The seven transmembrane regions are labeled as yellow boxes. There was a significant enrichment of diagnostic mutations in the transmembrane regions (Fisher test, P=0.001). D. A violin plot showing the domain composition of AQP2 with the location of the 70 population variants (SNPs) from gnomAD (turquoise squares) and the 62 diagnostic mutations from HGMD (peachpuff circles), including 56 mutations with autosomal recessive inheritance (AR, orange circles) and 6 mutations with autosomal dominant inheritance (AD, scarlet dots). The eight transmembrane regions are labeled as yellow boxes. There was no significant enrichment of diagnostic mutations in the transmembrane regions compared with population variants (Fisher test, P=0.079). The AD mutations were more likely than the AR mutations to be located in the C-terminal region (Pearson test, P=0.0001).

Figure 2
Analysis of variants and residue depth computation 9 in AVPR2 and AQP2. A. A 2D plot of residue-wise depth of AVPR2 (PDB 6U1N) with bars indicating the diagnostic mutations from HGMD (orange), population variants/SNPs from gnomAD (turquoise) and all the other residues (blue). B. Boxplot comparing the residue depth of the diagnostic mutations (orange), population variants/SNPs (turquoise) and all the other residues (blue) in AVPR2. * P = 2e-8, ** P = 2e-5 (Wilcoxon-Mann-Whitney test). C. A 2D plot of residue-wise depth of AQP2 (PDB 4NEF) with bars indicating the diagnostic mutations from HGMD (orange), population variants/SNPs from gnomAD (turquoise) and all the other residues (blue). D. Boxplot comparing the residue depth of the diagnostic mutations (orange), population variants/SNPs (turquoise) and all the other residues (blue) in AQP2. * P = 2e-7, ** P = 2e-5 (Wilcoxon-Mann-Whitney test).

Figure 3
Clinical treatment and outcome in 10 children with NDI. A. Simplified tonicity balance for the neonate ID#3 with NDI, excreting hypotonic urine and receiving 0.19% saline. When the balances for water and Na are considered separately, the excess Na administration from 0.19% saline becomes immediately obvious. The red box in the middle represents the total body water compartment of a patient with NDI. A patient with NDI excretes hypotonic urine, here with a measured urine Na concentration of 12 mmol/l. If 1 L of urine output is replaced with 1 L of 0.19% saline, this will not change the fluid balance but will lead to a net gain of 20.5 mmol of Na. In a neonate of 3.5 kg with an estimated 2.8 L of total body water, this would lead to an increase in the plasma Na concentration of approximately 20.5 mmol / 2.8 L = 7.3 mmol/l. B. Serum sodium and treatment in case #3. C. Follow-up of the serum sodium level in ten patients treated with thiazides and indometacin.