Assessing utility of clinical exome sequencing in diagnosis of rare idiopathic neurodevelopmental disorders in Indian population

Background: Neurological diseases are phenotypically and genotypically heterogeneous. Clinical exome sequencing (CES) has been shown to provide a high diagnostic yield for these disorders in the European population, but this remains to be demonstrated for the Indian population. Methods: A cohort of 19 idiopathic patients with neurological phenotypes, primarily intellectual disability and developmental delay, was recruited. CES covering 4620 genes was performed on all patients. Candidate variants were validated by Sanger sequencing. Results: CES in 19 patients identified 21 variants across 16 genes that have been associated with different neurological disorders. Fifteen variants were reported previously and 6 variants were novel to our study. Eleven patients were diagnosed with autosomal dominant de novo variants, 7 with autosomal recessive and 1 with X-linked recessive variants. CES provided a definitive diagnosis for 10 patients; hence, the diagnostic yield was 53%. Conclusion: Our study suggests that the diagnostic yield of CES in the Indian population is comparable to that reported in the European population. CES together with deep phenotyping could be a cost-effective way of diagnosing rare neurological disorders in the Indian population.


Background
The brain is an incredibly complex organ consisting of a plethora of interconnected cell types. During development and day-to-day functioning, a variety of neurons and numerous proteins are required in the right amount, at the right place and at the right time. Hence, any pathogenic mutation affecting genes involved in the production of these proteins can have consequences for brain development and functioning. Neurological disorders are conditions in which the motor, sensory, and cognitive functions decline due to variation(s) in the genotype of one or more genes involved in the functions of neurons, spinal cord, and peripheral nerves [1]. They rank as the leading cause of disability-adjusted life-years, accounting for 10.2% of the global total compared with 7.3% in 1990, and as the second-leading cause of death, responsible for 16.8% of global deaths [2,3]. These disorders form a clinically heterogeneous and genetically diverse group affecting all age groups, with sporadic (autosomal dominant) or acquired (autosomal or X-linked recessive) inheritance. This makes diagnosis with conventional techniques more challenging and often deprives the patient of proper treatment. Furthermore, genetic counselling during pregnancy becomes challenging when families seek prenatal diagnosis.
An extensive genotypic overlap amongst a range of neurodegenerative diseases involving neuropathy, myopathy, epileptic encephalopathy, ataxia, paraphasia, intellectual disability, and sensory impairment hampers genetic diagnosis and makes it difficult to target specific genes for study [4].
For instance, EpilepsyGene, designed in 2015, includes 499 genes and 3931 variants associated with epilepsy [5]. Moreover, with respect to intellectual disability, 528 confirmed genes and 628 candidate genes are reported [6]. Different variations in a single gene can lead to different clinical entities in neurological disorders. A good example is variation in the LIS1 gene (also known as PAFAH1B1: platelet activating factor acetylhydrolase 1b regulatory subunit 1 gene). A small deletion in the LIS1 gene leads to an isolated lissencephaly sequence. However, a large deletion covering the LIS1 gene and neighbouring genes causes Miller-Dieker syndrome [7]. Both diseases exhibit similar clinical traits except for facial dysmorphism, which is observed only in Miller-Dieker syndrome [8]. Thus, identifying the genetic cause of a neurological disease using conventional techniques such as polymerase chain reaction (PCR) and Sanger sequencing (a one-locus-at-a-time approach) is practical mainly for diseases with a well-established genotype-phenotype correlation. However, many neurological disorders, even after a meticulous evaluation, remain undiagnosed due to the presence of mild or unusual traits and the lack of a precise molecular basis. The Sanger sequencing approach is laborious and expensive, and falls short in detecting nucleotide repeat expansions, large indels, or copy number variations.
Clinical and whole exome sequencing (CES and WES) have proved clinically useful for identifying and characterizing the genes and variants underlying the clinical presentation of idiopathic neurological disorders, because they can sequence multiple genes at a time in a cost-effective manner. With the application of WES, the diagnosis rate in clinical practice has increased, which helps in early diagnosis, prognosis, reproductive counselling, medical management and prenatal diagnosis [9]. Furthermore, it is useful in the diagnosis of etiologically misleading neurological disorders that give false positive results with conventional techniques.
However, most of the studies carried out to date have been in the white European population and western healthcare settings. The diagnostic utility and cost-effectiveness of WES and CES are currently unknown for the Indian population. Since the goal of introducing any technique in clinical practice is to maximize diagnostic yield and minimize cost to the patient, our aim was to assess the utility of CES in diagnosing rare idiopathic neurological disorders in a cohort of 19 patients from the Indian population.

Patient recruitment
The present study comprised 19 unrelated idiopathic patients with variable neurodevelopmental phenotypes, including intellectual disability and developmental delay, who were referred between 2015 and 2017. All patients were evaluated by their referring clinicians according to clinically validated developmental scales. Each referring physician provided details on their phenotype, particularly:

DNA extraction
Four milliliters of whole blood drawn from each patient in an EDTA vacutainer was used for the genomic DNA isolation using the salting-out technique [15]. DNA was quantified using QIAxpert

Clinical presentation
Out of 19 patients in our cohort, 12 were males and 9 were females, with the mean age at the time of study being 10.8 years (range 9 months to 45 years) (Figure 1). Recurrent features included global developmental delay, intellectual disability and muscle abnormalities such as hypotonia, walking difficulty and poor control of neck and body in these patients.
Of interest, out of the 5 novel mutations identified in the current study, 1 was classed as pathogenic (SCN2A c.1153delT), 2 as likely pathogenic (SCN1A c.5351T>A and BSCL2 c.461C>T) and 2 as variants of unknown significance (PGAP1 c.2286+5G>A and AFG3L2 c.1951A>G). Furthermore, segregation analysis of the PGAP1 c.2286+5G>A variant showed the parents to be heterozygous carriers and the affected siblings to be homozygous for the variant. However, due to the absence of parental DNA for the patient with the AFG3L2 c.1951A>G mutation, segregation analysis was not carried out.
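The classification of the novel variants named above can be tabulated programmatically. The following is a minimal sketch in Python and is not the analysis pipeline used in this study: the `NOVEL_VARIANTS` list simply transcribes the five variants from the text, while the `parse_cdna` helper and its regular expression are hypothetical and handle only the simple substitution and deletion forms that appear here.

```python
import re
from collections import Counter

# Novel variants transcribed from the text: (gene, cDNA change, classification).
NOVEL_VARIANTS = [
    ("SCN2A", "c.1153delT", "pathogenic"),
    ("SCN1A", "c.5351T>A", "likely pathogenic"),
    ("BSCL2", "c.461C>T", "likely pathogenic"),
    ("PGAP1", "c.2286+5G>A", "unknown significance"),
    ("AFG3L2", "c.1951A>G", "unknown significance"),
]

# Hypothetical, deliberately narrow pattern (an assumption for illustration):
# coding position, optional intronic offset, then either a substitution
# (REF>ALT) or a deletion (del followed by the deleted bases).
CDNA_PATTERN = re.compile(
    r"^c\.(?P<pos>\d+)(?P<offset>[+-]\d+)?"
    r"(?:(?P<ref>[ACGT])>(?P<alt>[ACGT])|del(?P<deleted>[ACGT]+))$"
)


def parse_cdna(change):
    """Split a simple cDNA-level change into position, offset and kind."""
    match = CDNA_PATTERN.match(change)
    if match is None:
        raise ValueError(f"Unsupported variant description: {change}")
    kind = "deletion" if match.group("deleted") else "substitution"
    return {
        "position": int(match.group("pos")),
        # An offset such as +5 marks an intronic base near a splice site.
        "intronic_offset": int(match.group("offset") or 0),
        "kind": kind,
    }


# Tally how many novel variants fall into each classification.
classification_counts = Counter(cls for _, _, cls in NOVEL_VARIANTS)

for gene, change, cls in NOVEL_VARIANTS:
    print(gene, change, parse_cdna(change), cls)
print(dict(classification_counts))
```

Keeping the gene, the change and the classification together in one record makes it straightforward to cross-check the counts quoted in the text against the variant list itself.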
Overall, out of 21 variants, 4 were classed as pathogenic, 9 as likely pathogenic and 8 as variants of unknown significance according to the ACMG-AMP guidelines. With the clinical exome sequencing approach, we were able to provide a definitive diagnosis for 10 patients; hence, our diagnostic yield with this approach was 53%.
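The arithmetic behind the reported yield can be reproduced in a few lines. The sketch below, in Python, recomputes the yield from the counts given above and, as an illustrative addition that is not reported in the study, attaches a Wilson score interval to show how much uncertainty a cohort of 19 carries. The function name `wilson_interval` and the choice of z = 1.96 (roughly a 95% interval) are assumptions made for this example.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    )
    return centre - half_width, centre + half_width


# Counts taken from the text.
diagnosed, cohort = 10, 19
classified = {"pathogenic": 4, "likely pathogenic": 9, "unknown significance": 8}

yield_fraction = diagnosed / cohort
low, high = wilson_interval(diagnosed, cohort)

print(f"Variants classified: {sum(classified.values())}")
print(f"Diagnostic yield: {yield_fraction:.0%}")
print(f"Approximate 95% interval: {low:.0%} to {high:.0%}")
```

With only 19 patients the interval is wide, which is consistent with the small-sample caveat raised in the Discussion.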

Discussion
In the current study, our aim was to assess the utility of clinical exome sequencing in the diagnosis of rare neurological disorders in India. Exome sequencing of 19 patients with intellectual disability and/or developmental delay provided a confirmed diagnosis for 10 patients, with ~50% of the mutations being of de novo origin. The study also elucidated 15 rare diseases in these patients that would otherwise have been difficult to diagnose with cheaper but lower-resolution orthogonal methods such as microarray and karyotyping.
Eleven patients were identified as carrying autosomal dominant de novo mutations, which is a known disease mechanism in rare neurological disorders [26,27]. Furthermore, 8 out of these 11 variants have previously been reported in other studies, further strengthening the evidence for the role of these variants in causing the respective diseases. It is noteworthy that, out of these 8 known de novo variants, 50% were missense variants and the remainder were either splice site or nonsense mutations. This finding has direct implications for genetic counselling, as de novo missense mutations can be associated with incomplete penetrance whereas nonsense and splice site mutations are not, as shown for the SCN1A gene [28].
Despite finding known disease-associated variants in the majority of the patients, our study identified 5 novel variants in 5 genes: SCN1A, SCN2A, PGAP1, AFG3L2 and BSCL2. Diseases associated with these genes include Dravet syndrome (OMIM#607208), early infantile epileptic encephalopathy type 11 (OMIM#613721), mental retardation type 42 (OMIM#615802), spastic ataxia type 5 (OMIM#614487) and hereditary spastic paraplegia (OMIM#270685), respectively. Whilst 3 of the 5 variants are classed as pathogenic or likely pathogenic according to the ACMG-AMP classification [29], they are to be interpreted with caution, as these variants would require replication in other patient and control cohorts as well as functional follow-up to implicate them as disease causing [30]. None of the novel variants identified in the current study had an autosomal recessive or X-linked recessive inheritance pattern. This suggests an intriguing hypothesis of a reduced probability of finding novel recessive genes compared to dominant genes in neurological diseases in the Indian population; one that is supported by the data available from studies in the European population [27].
Interestingly, the diagnostic yield of clinical exome sequencing in our cohort was 53%, which is in concordance with the published literature [27]. However, this needs to be placed in contrast with the role of de novo copy number variants (CNVs), which also play a role in the pathogenesis of neurological disorders [27].
Genomic microarray-based studies have shown a strong correlation between the number of genes affected by a CNV and phenotypic severity [27,31]. Indeed, microarray-based studies have shown the presence of rare, autosomal dominant de novo CNVs in approximately 10% of patients [27].
Whilst microarray has been the mainstay for detection of CNVs, exome sequencing based detection of large CNVs (>400 kb) is becoming increasingly prominent in diagnosing neurological disorders [32].
Furthermore, it is estimated that 45-60 de novo single nucleotide variants (SNVs) occur per genome per generation, whereas the frequency of de novo CNVs larger than 500 kb is approximately 0.01 per genome per generation [33,34]. This difference in mutation rates, together with the difference in mutation detection abilities, could explain an enhancement in the diagnostic yield of exome sequencing by 24-33% over microarray [27]. Therefore, using exome sequencing to identify de novo variants (both SNVs and CNVs), rather than a microarray-based approach that identifies only de novo CNVs, is likely to be an attractive approach in neurological diseases.
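To give a sense of scale for the mutation rates quoted above, the short Python sketch below computes how many times more frequent de novo single nucleotide variants are than large de novo CNVs. It uses only the genome-wide figures cited in the text; the variable names are hypothetical, and the calculation is a back-of-the-envelope illustration rather than an estimate of diagnostic yield.

```python
# Genome-wide de novo rates per generation, as quoted in the text.
snv_rate_range = (45, 60)   # de novo single nucleotide variants
large_cnv_rate = 0.01       # de novo CNVs larger than 500 kb

# Fold difference between SNV and large-CNV rates at both ends of the range.
fold_low, fold_high = (rate / large_cnv_rate for rate in snv_rate_range)

print(f"De novo SNVs are roughly {fold_low:.0f}- to {fold_high:.0f}-fold "
      "more frequent than large de novo CNVs")

# Equivalently: about one large de novo CNV per this many births.
births_per_cnv = 1 / large_cnv_rate
print(f"About one large de novo CNV per {births_per_cnv:.0f} births")
```

These are raw genome-wide rates, so they say nothing on their own about how many of the events are clinically relevant or detectable by a given assay.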
Whilst the current study highlights several benefits of using an exome sequencing based approach in diagnosing neurological diseases, there are some caveats that need to be highlighted. First, the diagnostic yield of 53% in our study could be misleading if taken at face value. Although an overall diagnostic yield of 50-70% has been reported for moderate to severe intellectual disability [27], the depth and quality of patient phenotyping can affect diagnostic yield [35].
The current study carried out in-depth patient phenotyping, which may have aided in interpreting genotype data and disease diagnosis. Second, for the 11 patients in whom de novo SNVs were identified, Sanger validation to confirm the mode of inheritance was not carried out due to the unavailability of parental samples. Without Sanger sequencing confirmation in parental samples, it is conceivable that these variants may have been inherited from one of the parents. However, since these disorders have a significant impact on the patient's fitness [27], it is unlikely for either of the parents to be a carrier of these mutations. Hence, despite the absence of parental samples, replication of variants from the literature together with heterozygous status in the patient's sample suggests that these variants are likely to be of de novo origin. Third, the current study had a small sample size compared to large multicenter projects such as Deciphering Developmental Disorders (https://www.ddduk.org). However, the study aimed to assess the utility of clinical exome sequencing in the Indian population rather than to identify novel genes and pathways involved in neurological disorders, and hence required only a small sample size.

Conclusions
Genetic studies have significantly improved in the past decade and consequently, there has been a substantial improvement in the diagnosis of neurological disorders. Due to the phenotypic and genetic heterogeneity of neurological disorders, it is required to carry out hypothesis-free exome-wide approaches as a first-tier diagnostic test. Even with the current lack of knowledge around all
