Detection of Clinically Relevant Copy Number Variations and Genes in a Bangladeshi Cohort of Neurodevelopmental Disorders

Background: Copy number variations (CNVs) play a critical role in the pathogenesis of neurodevelopmental disorders (NDDs) among children. In this study, we aimed to identify clinically relevant CNVs, genes, and their phenotypic characteristics in an ethnically underrepresented, homogeneous population of Bangladesh. Methods: We conducted genome-wide chromosomal microarray analysis (CMA) for 212 NDD patients, with a male to female ratio of 2.2:1.0, to identify rare chromosomal abnormalities (deletions, duplications, and rearrangements). To identify candidate genes within the rare CNVs, multiple gene constraint metrics (i.e., “Critical-Exon Genes (CEGs)”) were applied to the population data. The Autism Diagnostic Observation Schedule, Second Edition (ADOS-2) was administered to a subset of 95 NDD patients to assess the severity of autism, and all statistical tests were performed using R. Results: In our cohort, the head circumference of males was significantly greater than that of females (p=0.0002). Of all samples assayed, 12.26% (26/212) and 47.17% (100/212) of patients carried pathogenic CNVs and CNVs classified as variants of uncertain significance (VOUS), respectively. In 2.83% (6/212) of patients, the pathogenic CNVs were located in subtelomeric regions. A further burden test identified that females were significantly more likely than males to carry pathogenic CNVs (OR=4.2; p=0.0007). In the ADOS-2 subset, patients carrying a duplication CNV showed a more severe social communication deficit (p=0.014) and greater overall ASD symptom severity (p=0.026) than the CNV-negative group. Candidate gene analysis identified 153 unique CEGs in pathogenic CNVs and 31 in VOUS. Of the unique genes, 18 were found in smaller (<1 Mb) focal CNVs, and PSMC3 was identified as a potential candidate gene for autism spectrum disorder (ASD). Moreover, we hypothesize that KMT2B gene duplication might be associated with intellectual disability.
Conclusion: Our results show the utility of CMA for precise genetic diagnosis and its integration into the diagnosis, therapeutics, and management of NDD patients.


Introduction
Neurodevelopmental disorders (NDDs) are a group of developmental deficits that disrupt the normal physiology and function of the brain. They are referred to as a collection of early-onset conditions that include autism spectrum disorders (ASDs), intellectual disability (ID), epileptic encephalopathy, attention deficit hyperactivity disorder (ADHD), obsessive compulsive disorder (OCD), and cognitive skills disorders [1][2][3][4][5]. Such disorders are termed nonsyndromic when isolated and syndromic when associated with dysmorphisms or apparent congenital anomalies (CA) [6]. The incidence of DD/ID is 3% in the general population [7], while statistics from the USA show that ASD affects 1 in 54 live births [8]. Individuals affected with NDDs usually present with reduced adaptive skills, limited intellectual ability, motor difficulties, CA, and problems with social interaction. Phenotypically, ASDs overlap substantially with epileptic encephalopathy, ADHD, Fragile X syndrome (FXS), motor abnormalities, and intellectual disability [9,10].
The etiology of NDDs is principally genetic. Advances in genomic techniques such as high-resolution microarrays and next-generation sequencing have yielded significant insights into the genetic etiology of NDDs [11]. For decades, structural genomic variation has been known to be a major contributor to the etiology in a proportion of children diagnosed with NDDs [12][13][14]. In the last decade, many large international genomic consortia have profiled NDD cases, mostly of European ancestry, to identify genomic alterations and NDD-associated genes. More than 100 genes and genomic loci [15] have been consistently found to be involved in the etiology of NDDs. Studies based on ASD cohorts have identified an increased burden of rare genic copy-number variations (CNVs) and have characterized rare, usually de novo, recurrent CNV loci that are thought to contribute to the genetic risk [16]. Specific genes within these CNV regions that are implicated in the etiology of ASD and other NDDs include SHANK3, SYNGAP1, NRXN1, GRM7, and DLGAP2 [17][18][19][20][21]. As the number of candidate genes and loci has increased, a striking recurrence of candidates identified in multiple disorders has been uncovered, which may account for a proportion of the significant comorbidity noted among neurodevelopmental disorders [22,23]. The availability of microarray-related technologies and the contribution of structural variations to NDDs have established whole-genome chromosomal microarray analysis (CMA) as one of the first-tier diagnostic tests for NDD cases in developing countries. In 2010, Miller et al. demonstrated the utility of CMA as a first-tier clinical diagnostic test to enable early diagnosis of individuals with NDDs [24].
In Bangladesh, autism is regarded as a great economic burden and a significant health problem. A community-based study reported an increase in the incidence of ASD from 0.2 (2005) to 0.84 (2009) per 1000 children in Bangladesh [25,26]. In a well-characterized cohort of suspected NDD cases, the gold-standard observational assessment tool ADOS-2 [27] confirmed ASD in 73.85% (209/283) of cases; the remaining 26.15% (74/283) were broader NDD cases [28]. The prevalence of NDDs, including cerebral palsy, developmental delay, and ASD, in the rural community is 5.6/1000, 2.6/1000, and 0.75/1000, respectively [29]. Here, NDDs are mostly diagnosed from the clinical condition of the patient and psychological assessment tools such as DSM-IV and ADOS/ADOS-2 [28,30]. Because the clinical presentation of NDDs is complex and overlapping, it is difficult to confirm a diagnosis by psychological assessment. Early diagnosis of NDD cases in children, however, may lead to better outcomes through expeditious educational planning and therapeutics [31]. In Bangladesh, the genetic causes of breast cancer, intellectual disability, and rare diseases have been uncovered by whole genome sequencing [32], whole exome sequencing [33], targeted sequencing [34,35], whole genome microarray [36], and quantitative PCR [37]. However, there has been no comprehensive genetic study of a large NDD cohort. Our study analyzed a cohort of 212 NDD patients from Bangladesh who underwent microarray testing between 2017 and 2020 to identify copy number variations (CNVs) for diagnostic purposes. To our knowledge, this is the first cohort of NDD patients reporting a significant number of clinically relevant variants and genes from the Bangladeshi population.

Cohort description
The cohort comprised 212 patients with neurodevelopmental disorders, including autism spectrum disorders (ASDs), developmental delay (DD), seizures/epilepsy, intellectual disability (ID), syndromic features, psychiatric or behavioral issues, hypotonia, speech and language disability, and attention deficit hyperactivity disorder (ADHD). Almost all patients had more than one phenotype. The age of the patients ranged from 9 days to 31 years, with 54.25% (115/212) in the range of 1-5 years and 3.8% (8/212) less than 1 year of age (Table 1). Around 68.40% (145/212) of the cohort were male. Of the 212 patients, ten (7 male and 3 female) were not included in the statistical analysis of age, weight, occipitofrontal circumference (OFC), and BMI because the corresponding data could not be obtained (Table 1 and Table 2). Moreover, a subset of patients (n=95) was evaluated for autism spectrum disorders with ADOS-2. Among these 95 participants (70 male, 25 female), 71 met the criteria for autism positive (51 male, 20 female) and the remaining 24 were classified as autism negative. In addition to these 71, a further 24 autism patients were diagnosed with other assessment tools, DSM-V (n=14) and ADOS (n=10). Therefore, in total there were 95 ASD patients (44.81%, 95/212) diagnosed by different psychological assessment tools, DSM-V (n=14), ADOS (n=10), and ADOS-2 (n=71), with a male to female ratio of 2.65:1 (Table 2) and ages ranging from 1.5 to 19 years. We also collected occipitofrontal head circumference, measured in centimeters using a non-stretchable plastic tape, and body weight, measured in kilograms using a calibrated weighing machine.
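The proportions quoted in the cohort description can be re-derived from the stated counts. The following is a minimal sketch of that arithmetic; the helper name `pct` and the variable names are illustrative and not part of the study's analysis, and the female count of 67 is simply the total minus the reported number of males.

```python
TOTAL = 212  # cohort size stated in the text


def pct(count, total=TOTAL):
    """Return count/total as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


males = 145
females = TOTAL - males  # 67, implied by the reported totals

age_1_to_5 = pct(115)      # patients aged 1-5 years
under_1_year = pct(8)      # patients younger than 1 year
male_share = pct(males)    # male share of the cohort
asd_share = pct(95)        # ASD patients across all assessment tools
sex_ratio = round(males / females, 1)  # male to female ratio, full cohort
ados2_ratio = round(51 / 20, 2)        # male to female ratio, ADOS-2 positives

print(age_1_to_5, under_1_year, male_share, asd_share, sex_ratio, ados2_ratio)
```

Under these counts, the share of patients under one year comes out at 3.77%, which the text rounds to 3.8%.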

Ethics statement
The study protocol was approved by the Institutional Review Board of Holy Family Red Crescent Medical College and Hospital. Before participant enrolment, written informed consent, including consent for the use of peripheral blood and clinical data for research and publication, was obtained from the parents.

ADOS-2
The Autism Diagnostic Observation Schedule, Second Edition (ADOS-2) [27] is a semi-structured, standardized assessment of communication, social interaction, imaginative use of materials, and restricted and repetitive behaviors. ADOS-2 is a gold-standard observational assessment for diagnosing ASD. It has five modules, and each module offers standard activities designed to elicit behaviors that are relevant to the diagnosis of ASD at different chronological ages and language abilities. For the present study, autism characteristics were measured using the Toddler Module, Module 1, and Module 2. Each module uses communication, reciprocal social interaction, and restricted and repetitive behaviors to generate a total score. Elevated scores classify an individual in the autism spectrum or autism diagnostic range based on the severity or frequency of the behaviors displayed.

DNA extraction and quantitation
DNA was extracted from peripheral blood samples using the ReliaPrep™ Blood gDNA isolation kit (Promega, USA) following the manufacturer's protocol. The quality of the extracted DNA was checked using a NanoDrop spectrophotometer. Samples were then electrophoresed on agarose gels, and those with intact genomic DNA showing no smearing were selected for the experiment. Intact genomic DNA was diluted to a concentration of 50 ng/µL based on Quant-iT PicoGreen (Invitrogen) quantitation. The whole-genome amplification process requires 200 ng of input gDNA.

Whole-genome microarray
We conducted genome-wide microarray analysis to identify chromosomal abnormalities (deletions, duplications, translocations, and rearrangements) and investigated the changes in fluorescence intensities between the test specimen and the controls. The Illumina Global Screening Array-24+ v1.0 was used with Illumina SNP genotyping technology, and the Illumina cnvPartition 3.2.1 plug-in of GenomeStudio was used to detect chromosomal abnormalities. This microarray uses 642,824 probes spread across the genome to detect genetic abnormalities greater than 30 kb (including >60 loci reported in the DECIPHER database for neurodevelopmental disorders) and also targets subtelomeric regions that are vulnerable to chromosomal abnormalities. We used multiple algorithmic techniques (implemented in MATLAB and Java) and manual curation of the data to pinpoint genomic variation based on the normalized log2 intensities of the probes. Our algorithm excluded from analysis all common CNVs found in the in-house (NeuroGen) control population (9689 samples) and analyzed only rare CNVs to infer their contribution to human disease. Digestion, ligation, PCR, labeling, hybridization, and scanning were performed following standard protocols.
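The exclusion of common control CNVs can be illustrated with a small filter. This is only a sketch of one widely used approach (matching by reciprocal overlap), not the study's pipeline: the text does not describe how a case CNV was matched to a control CNV, so the 50% reciprocal-overlap threshold, the `CNV` record, and all coordinates below are assumptions made for illustration. Only the 30 kb size floor comes from the text.

```python
from typing import NamedTuple


class CNV(NamedTuple):
    chrom: str
    start: int
    end: int
    kind: str  # "del" or "dup"


MIN_SIZE_BP = 30_000          # size floor stated in the text (30 kb)
MIN_RECIPROCAL_OVERLAP = 0.5  # assumed threshold, not given in the text


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Shared length divided by the longer of the two CNVs (0.0 if disjoint)."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def rare_cnvs(calls, common_controls):
    """Keep calls that reach the size floor and match no common control CNV."""
    kept = []
    for call in calls:
        if call.end - call.start < MIN_SIZE_BP:
            continue
        if any(reciprocal_overlap(call, ctrl) >= MIN_RECIPROCAL_OVERLAP
               for ctrl in common_controls):
            continue
        kept.append(call)
    return kept


# Hypothetical coordinates, for illustration only.
controls = [CNV("chr1", 1_000_000, 1_100_000, "dup")]
calls = [
    CNV("chr1", 1_010_000, 1_090_000, "dup"),  # matches a common control CNV
    CNV("chr2", 5_000_000, 5_400_000, "del"),  # rare and large enough
    CNV("chr3", 2_000_000, 2_010_000, "del"),  # below the 30 kb size floor
]
print(rare_cnvs(calls, controls))
```

Requiring the overlap to be reciprocal prevents a very large case CNV from being discarded just because it contains a small common polymorphism.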

Results
The cohort comprised 212 patients with neurodevelopmental disorders, with a male to female ratio of 2.2:1.0 (Table 1). Of the 212, 202 were included in the descriptive statistical analysis, which showed significant differences between the male and female groups with respect to weight (p=0.001), head circumference (p=0.0002), and BMI (p=0.001) (Table 1 and Figure 1). Patients carrying rare CNVs (pathogenic/VOUS) and patients carrying no rare CNVs are defined as the CNV-positive and CNV-negative groups, respectively. Significant differences with respect to head circumference (p=0.0001), weight (p=0.002), and BMI (p=0.007) were also found between the male and female CNV-positive groups (Table 1, Figure S1), while no significant differences in head circumference or weight were observed in the CNV-negative groups (Table S2). Females were significantly more likely than males to carry a pathogenic CNV (OR=4.2; p=0.00077) (Figure 3d). In this cohort, 95 patients were evaluated for autism spectrum disorders with ADOS-2; 71 met the ADOS-2 cut-off criteria for autism positive (51 male, 20 female; male to female ratio 2.55:1), while the remaining 24 were classified as autism negative (non-spectrum). In the entire cohort (p=0.0096) and in the subset of ASD-positive cases (p=0.019), the head circumference of males was significantly greater than that of females (Table 2, Figure 2a and 2b), while no significant differences were observed between males and females in age, weight, height, BMI, social affect score, or ADOS-2 total score. We also observed that patients carrying a duplication CNV showed a more severe social communication deficit (p=0.014) and greater overall ASD symptom severity (p=0.026) than the CNV-negative group (Table 2, Figure 2d and 2e). Moreover, a trend toward a higher number of CNVs in autism patients than in non-spectrum individuals was observed (OR=2.29; p=0.06) (Figure 2c). The detailed statistical analysis of the subset is given in Table S3.
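The odds ratio and Fisher's exact test behind the sex-specific burden result can be sketched as follows. The per-sex carrier counts are not stated in the text, so the split used here (16 female and 10 male carriers) is hypothetical: it was chosen only because it agrees with the reported margins (26 carriers, 145 males, 67 females) and should not be read as the study's data. The functions are a plain standard-library implementation written for illustration; the authors report using R.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_obs * (1 + 1e-9))


# Rows: female, male. Columns: pathogenic CNV carrier, non-carrier.
# HYPOTHETICAL split, consistent with the margins reported in the text.
female_carriers, female_noncarriers = 16, 51   # 67 females
male_carriers, male_noncarriers = 10, 135      # 145 males

table = (female_carriers, female_noncarriers, male_carriers, male_noncarriers)
or_value = odds_ratio(*table)          # about 4.24 for this split
p_value = fisher_exact_two_sided(*table)
print(or_value, p_value)
```

Placing females in the first row makes an odds ratio above 1 correspond to a higher carrier rate in females, which matches the direction of the reported effect.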
(Table 3). All the CNVs in these regions were further confirmed by ddPCR (Table S4 and Figure S2). In this cohort, 2.83% (6/212) and 20.28% (43/212) of patients carried pathogenic and VOUS subtelomeric CNVs, respectively (Table 3 and Table S1). Of the 26 patients with pathogenic CNVs, one carried a double terminal deletion affecting chromosome 18 (Table 3), and 11 also carried a VOUS in addition to a pathogenic CNV (Table S5). The 27 pathogenic CNVs comprised 16 deletions and 11 duplications (Table 3 and Figure 3C). The average lengths of pathogenic deletions and duplications were 5365.26 kb and 17451.72 kb, respectively, and the most frequent size classes for pathogenic deletions and duplications were 30-2000 kb and >20000 kb, respectively (Figure 3e). To exclude false CNV calls, we randomly chose 9 pathogenic CNVs for ddPCR validation, of which 8 were validated (8/9, 88.89%) (Table S4 and Figure S2).

Discussion
In this study, we found significantly greater head circumference in male patients than in female patients, both in the overall NDD cohort and among autism-positive patients. This is consistent with previous studies in other ethnically diverse populations, which showed a similar association, with abnormal acceleration of head growth among children with ASD compared to neurotypical children [57][58][59]. This observation hints that aberrant brain cell proliferation may be a key neurobiological mechanism in the disorder [60]. Some studies also suggest that certain mutations underlying neurodevelopmental disorders may lead to changes in brain volume, microcephaly, or macrocephaly [61, 62]. No significant difference in head circumference was observed between the CNV-positive and CNV-negative groups or between the CNV deletion and duplication groups (Table S2 and S3). However, larger head circumference and weight were found in males carrying rare CNVs than in females carrying rare CNVs (Table 1, Figure 1). This is the first Bangladeshi NDD cohort to undergo whole genome microarray analysis using the GenomeArc annotation tool, and it identified a diagnostic yield of   (Table 3).
Identifying overlapping genes and pathways across disorders is critical to improving the understanding of their potential shared genetic etiology. Gene Ontology and KEGG pathway enrichment analysis of the impacted genes within the CNV breakpoints of all pathogenic deletions and duplications identified "ubiquitin-like protein transferase activity (GO:0019787)", "vesicle-mediated transport in synapse (GO:0099003)", "dependent protein catabolic process (GO:0030163)", "regulation of neuron death (GO:1901214)", and "synaptic signaling (GO:0099536)" pathways as highly significant (Figure 5A and 5B). Aberrations in autophagy-related signaling (autophagy being a major cellular catabolic process) and mutations in autophagy-related genes have been implicated in several neurodevelopmental disorders, including autism, tuberous sclerosis, Fragile X syndrome, and neurofibromatosis type 1 [74,75]. Loss of function of the ubiquitin ligase gene HERC2 has been associated with a severe neurodevelopmental phenotype [75]. Mutations in presynaptic genes have been linked to various neurodevelopmental disorders, including autism, intellectual disability, and epilepsy [76], and synaptic signaling has been identified as one of the principal molecular pathways affected in neurodevelopmental disorders [77]. Perturbations in the apoptotic signaling pathway have also been identified in various NDDs, including autism, Fragile X syndrome, and schizophrenia [78,79].
Analysis with the "critical-exon" method [38, 49] to identify constrained candidate genes found a higher number of CEGs per pathogenic variant (7.5) than per VOUS (0.19) (p=0.0002). A higher number of CEGs per pathogenic CNV has been reported previously [80] and reflects the length bias and gene density of pathogenic variants. CEG analysis of short focal CNVs identified 18 unique CEGs that are highly expressed in the brain and have a low burden of non-synonymous variants.
To assess the role of these genes in the context of neurodevelopmental disorders and our cohort, we conducted a comprehensive literature search. For example, in our cohort, one autism patient (#20) carried a 476 kb pathogenic deletion disrupting the PLCB1 gene. Girirajan et al. found an enrichment of microdeletions and duplications involving the PLCB1 gene in individuals with autism [81]. The remaining 17 patients carried VOUS. Among these 17, we looked for a common genetic breakpoint shared by multiple patients with the same clinical condition in order to identify a candidate gene at the disrupted locus. We found 2 autism patients (#119 and #121) who harbored a common deletion breakpoint disrupting 4 genes, SLC39A13, PSMC3, SPI1, and RAPSN, including one critical-exon gene, PSMC3. Autosomal recessive mutations in the SLC39A13 gene are associated with a well-defined disease, Ehlers-Danlos syndrome, spondylodysplastic type, 3 (OMIM# 612350). Mutation in the SPI1 gene was previously reported in acute lymphoblastic leukemia [82]. Autosomal recessive mutations in another gene, RAPSN, are associated with two other well-characterized diseases, fetal akinesia deformation sequence 2 (#601592) and myasthenic syndrome, congenital, 11, associated with acetylcholine receptor deficiency (#616326). The remaining gene at the common breakpoint is PSMC3. PSMC3 encodes the 26S regulatory subunit 6A, also known as the 26S proteasome AAA-ATPase subunit (Rpt5) of the 19S proteasome complex, which is responsible for recognition, unfolding, and translocation of substrates into the 20S proteolytic cavity of the proteasome [83]. This suggests that PSMC3 plays an essential role in the ubiquitin-proteasome system (UPS), whose functions include morphogenesis, dendritic spine structure, synaptic activity, and the regulation of  (Figure 6c, 6d and Table S7).
Although our patients, at the ages of 1.6 and 3.1 years, showed no syndromic features other than ASD, we hypothesize, on the basis of previous UPS studies [84][85][86][87] and DECIPHER [66] data, that heterozygous mutation of the PSMC3 gene (size ~7.7 kb) might be associated with NDDs without causing a neurosensory syndrome.
In this study, we found a 20-year-old girl (#106), born of a consanguineous marriage to healthy parents, who was delivered preterm after an eventful pregnancy (IUGR). She had a history of delayed development. She had intellectual disability with dysmorphic features of an elongated face and long fingers (Table S1). Her mental and physical condition worsened progressively from the age of 15 years. At the age of 18 years, she developed dyskinesia and swallowing difficulty. From this age, she was unable to walk independently or talk clearly. Her MRI findings were normal at this age. When we analyzed the overlapping region of fourteen previously reported deletions (length 0.19 to 4.91 Mb) [55, 90-94] and three duplications (length 3.31 to 12.63 Mb) [90], our patient (#106) carried a shorter 2.01 Mb duplication with a 38.4 kb region (chr19:35,700,296-35,738,700) common to all the CNVs, disrupting 2 genes, ZBTB32 and KMT2B (Figure 7 and Table S8). Further, CEG and GenomeArc analysis also identified UBA2, USF2, SCN1B, KMT2B, COX6B1, LGI4, and ZNF599. SCN1B is the most interesting gene; when disrupted by pathogenic mutations, it is known to be associated with atrial fibrillation, familial, 13 (#615377), Brugada syndrome 5 (#612838), developmental and epileptic encephalopathy 52 (#617350), and epilepsy with febrile seizures plus, type 1 (#604233). Our patient had no history of seizures, epilepsy, or cardiac problems, which indicates that SCN1B duplication might not be associated with her clinical condition. KMT2B is another interesting gene within this breakpoint; pathogenic mutations are associated with dystonia 28, childhood-onset (#617284), whose core phenotype is described as limb-onset childhood dystonia that tends to spread progressively, resulting in generalized dystonia with craniocervical involvement. Co-occurring signs such as distinct facial dysmorphism and intellectual disability are the most common [54,55].
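The smallest region shared by a set of overlapping CNVs is the intersection of their intervals. The sketch below shows that calculation and checks the length of the reported chr19 region; the three CNV intervals are hypothetical, since the coordinates of the individual published deletions and duplications are not listed in the text, and the function name `common_overlap` is illustrative.

```python
def common_overlap(intervals):
    """Return the (start, end) shared by every interval, or None if there is none."""
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start < end else None


# Common region on chr19 as reported in the text.
region = (35_700_296, 35_738_700)
region_kb = (region[1] - region[0]) / 1000  # 38.404 kb, reported as 38.4 kb

# Hypothetical CNV intervals on chr19, for illustration only.
cnvs = [
    (35_000_000, 35_738_700),
    (35_700_296, 37_000_000),
    (35_500_000, 36_200_000),
]
shared = common_overlap(cnvs)
print(region_kb, shared)
```

Because the intersection can only shrink as more CNVs are added, each additional overlapping case narrows the list of candidate genes.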
There is a distinct group of KMT2B patients presenting with a neurodevelopmental disorder in the absence of dystonia or a related movement disorder [52,54,56]. We also found some DECIPHER [90] patients carrying duplication CNVs containing the KMT2B gene whose common phenotype was global developmental delay (GDD) in the absence of dystonia. Most of the clinical features of our patient (#106), namely global developmental delay, intrauterine growth retardation, intellectual disability, facial dysmorphism, dyskinesia, and problems with swallowing, walking, and talking, match KMT2B-related disorder. We therefore hypothesize that KMT2B gene duplication might be associated with KMT2B-related disorder. KMT2B haploinsufficiency due to frameshift (small insertion/deletion), nonsense, splice-site, missense, and large deletion mutations is the primary disease mechanism [54][55][56]; it is also reported that the penetrance of KMT2B-related disease is high, with almost complete penetrance for protein-truncating variants and chromosomal deletions and reduced penetrance for missense variants [54,56]. A report from China showed that DYT-KMT2B and KMT2B-related neurodevelopmental disease without dystonia can occur even within the same family [95]. From current knowledge of KMT2B-related studies, we hypothesize that patients with KMT2B duplication variants have lower penetrance for KMT2B-related disorders than those with chromosomal deletions and loss-of-function mutations [54,56]. Within the common overlapping region, ZBTB32 is another new candidate gene in our cohort, with no previous report of association with dystonia or neurodevelopmental disorders. In our patient, we also found another 891 kb VOUS that disrupts the important gene FTL. Pathogenic mutations in this gene are associated with neurodegeneration with brain iron accumulation 3 (NBIA3; OMIM#606159) or hyperferritinemia with or without cataract (HRFTC; OMIM#600886).
NBIA3 is characterized by progressive iron accumulation in the basal ganglia and other regions of the brain, resulting in extrapyramidal movements such as parkinsonism, dystonia, and dyskinesia. Age at onset is variable, ranging from 13 to 63 years [96,97]. Neither cavitation of the basal ganglia nor any evidence of iron deposition was found in the MRI report of our patient at the age of 18 years (the report noted normal findings). Although our patient has no confirmed pathological findings of NBIA3 or cataract, an association between FTL duplication and her clinical condition cannot be excluded because of the variable age of onset of the disease.
In our cohort, a series of overlapping rare, clinically relevant variants was identified in multiple patients. For example, a 96.8 kb deletion at 6p21.33 was found in two unrelated ASD patients (#8 and #127); it disrupts the major histocompatibility complex (MHC) class I gene MICA, which was not previously associated with broader neurodevelopmental disorders. Another three unrelated patients (#13, #15 and #207) carried a duplication of around 48 kb at 15q13.3 that disrupts the previously reported [98-100] broader NDD genes CHRNA7 and OTUD7A. We also found a 381 kb duplication at 8q21.2 in two unrelated ASD patients (#34 and #104) that disrupts the carbonic anhydrase II gene, CA2, associated with osteopetrosis, autosomal recessive 3 (OPTB3; OMIM# 259730). We found two siblings (#42, #43) affected with variable NDD phenotypes and one unrelated patient (#152) carrying duplications of 233 kb to 340 kb at 3p26.3 that disrupt the previously reported [101,102] broader NDD gene CNTN6. Another two unrelated ASD patients carried a duplication CNV of around 130 kb disrupting the previously reported [103] NDD gene NPHP1. Three unrelated patients (#57, #159 and #200) carried duplication CNVs of 442 kb to 691 kb with a 236 kb overlapping region disrupting two constrained genes, NPY4R and GPRIN2. In addition, we found two unrelated ASD patients (#65 and #99) carrying 51 kb and 59 kb duplications at 4p16.3 disrupting the alpha-L-iduronidase gene, IDUA.

Conclusion
In this paper, we have shown the utility of whole genome microarray as a first-tier diagnostic technology for patients with neurodevelopmental disorders. Without a proper genetic test, the complex clinical presentation alone may not be enough to identify the cause and often leads to a diagnostic odyssey. The price of microarray testing is falling, and in the near future developing countries will be able to implement such technology within their healthcare settings. To resolve the diagnosis of NDD cases, we strongly recommend the use of whole genome microarray testing in developing countries, which will eventually lead to a precise diagnosis for 10-20% of NDD cases and will enable the detection of novel variants and genes in underrepresented populations.

Declarations
The study was conducted according to the Declaration of Helsinki and was approved by the Institutional Ethical Review Committee (IERC) of Holy Family Red Crescent Medical College and Hospital, and all samples were collected with written informed consent.

Consent for publication
Written informed consent for publication was obtained from parents or legal guardians for all individuals involved in the study.

Availability of data and materials
Patients' phenotypic data are contained within the supplementary material. Genomic data can be shared for any collaborative research that involves NeuroGen Healthcare. Please request this via the corresponding author.

Competing interests
The authors declare that they have no competing interests.

Summary of CNVs identified in 212 NDD patients. A) Percentage of patients carrying pathogenic, VOUS, and benign CNVs. Of all samples assayed, 12.26% of patients carried pathogenic CNVs, of which 7.07% carried a pathogenic deletion and 5.19% a pathogenic duplication. B) Percentage of pathogenic, VOUS, and benign CNVs. Of the total 1053 CNVs, 3% are pathogenic, 17% are VOUS, and the remaining 80% are benign. C) Bars indicate the total number of CNVs in deletion and duplication. The green line represents the number of unique genes impacted by the corresponding variants. D) The percentage of male and female groups in the full cohort impacted by pathogenic CNVs. The p-value was calculated by Fisher's exact test. E) and F) Size distribution of pathogenic and VOUS CNVs. The average lengths of pathogenic deletions and duplications are 5365.26 kb and 17451.72 kb, respectively, whereas the average lengths of VOUS deletions and duplications are 129.17 kb and 476.76 kb.

The false discovery rate (FDR) and p-value cut-offs were 0.01 and 0.001, respectively. P-values are denoted using a color gradient (lower p-values in darker colors).