Genetic Testing of Leukodystrophies Unraveling Extensive Heterogeneity in a Large Cohort: The Role of Five Common Diseases and Report of 38 Novel Variants

Background: This study evaluates the genetic spectrum of leukodystrophies and leukoencephalopathies in Iran. Methods: From 2016 to 2019, 152 children aged 1 day to 15 years were genetically tested for leukodystrophies and leukoencephalopathies on the basis of clinical and neuroradiological findings. Patients with findings suggestive of a specific leukodystrophy, e.g. metachromatic leukodystrophy, Canavan disease, or Tay-Sachs disease, were tested for mutations in single genes (108; 71%), while patients with less suggestive findings were evaluated by NGS. Results: 108 of 152 (71%) patients had MRI patterns and clinical findings suggestive of a known leukodystrophy. In total, 114 (75%) affected individuals had pathogenic or likely pathogenic variants, which included 38 novel variants. 35 different types of leukodystrophies and genetic leukoencephalopathies were identified. The more common disorders identified were metachromatic leukodystrophy (19 of 152; 13%), Canavan disease (12; 8%), Tay-Sachs disease (11; 7%), megalencephalic leukodystrophy with subcortical cysts (7; 5%), X-linked adrenoleukodystrophy (8; 5%), Pelizaeus-Merzbacher-like disease type 1 (8; 5%), Sandhoff disease (6; 4%), Krabbe disease (5; 3%), and vanishing white matter disease (4; 3%). Whole exome sequencing (WES) identified leukodystrophies and genetic leukoencephalopathies in 26% of patients. The total diagnosis rate was 75%. Conclusions: This unique study presents national genetic data on leukodystrophies; it may provide clues to the genetic pool of neighboring countries. Patients with clinical and neuroradiological evidence of a genetic leukoencephalopathy should undergo genetic analysis to reach a definitive diagnosis. This will allow diagnosis at earlier stages of the disease, reduce the burden of uncertainty and costs, and provide the basis for genetic counseling.


Background
Leukodystrophies and genetic leukoencephalopathies are a large, heterogeneous group of genetic diseases affecting the white matter of the central nervous system. Individually the diseases are rare, but together they affected 1 in 7663 live births in a US study [8]; in Germany, the estimated prevalence of leukodystrophies is about 1-2 per 100 000 live births [4]. Most of these diseases are associated with severe progressive loss of motor and cognitive abilities, helplessness and early death. Their causes are either primary defects of myelin synthesis and myelin stability, or myelin damage that is secondary to disturbances outside this structure [10]. Some mitochondrial and lysosomal storage disorders, organic acidemias, other inborn errors of metabolism and vascular disorders are also categorized as genetic leukoencephalopathies [4].
Leukodystrophies are clinically and genetically heterogeneous disorders; their diagnosis is challenging and nearly half of patients remain undiagnosed [1], which places a high economic and psychological burden on society and the affected families. Many genes are known to cause these diseases, though the genetic etiology of many cases remains unknown. Advances in gene sequencing, including whole exome sequencing (WES), are unraveling the genetic causes of leukodystrophies [6]. Genetic testing confirms the diagnosis and may offer a chance for disease-specific palliative treatment or experimental therapies for some diseases (e.g. metachromatic leukodystrophy, Alexander disease, and Krabbe disease). In addition, molecular genetic analysis helps with family screening and reproductive decisions. Most of the pediatric disorders follow an autosomal recessive pattern of inheritance and arise in consanguineous marriages, which are prevalent in Iran and the Middle East. Despite advances in molecular technologies and the high frequency of genetic diseases in Iran, the crossroads of the Middle East, there is no comprehensive study on the genetics of pediatric white matter disorders in this region of the world. The genetic composition of different parts of Iran could be representative of the respective neighbors.
Here, we evaluated the genetic spectrum of subjects clinically diagnosed with leukodystrophies who were referred to a tertiary pediatric center in Iran. The purpose of the study was to determine the common types of leukodystrophies and genetic leukoencephalopathies, the neurological findings in the patients, and the ethnic distribution of the disease.

Patients
Patients clinically diagnosed with white matter deterioration, from different ethnicities of Iran, were enrolled in the study between 2016 and 2019. The clinical characteristics of leukodystrophies and leukoencephalopathies were confirmed by pediatric neurologists. Demographic data, medical and family history, physical evaluations, neurological examinations, magnetic resonance imaging (MRI), and laboratory testing were recorded for each patient.
The study was approved by the ethical committee of Children's Hospital, Tehran University of Medical Sciences. Informed consent was obtained for genetic testing.
Next generation sequencing: gene-panel and whole exome sequencing (WES)
Patients with an indefinite clinical diagnosis or overlapping symptoms and neurological findings underwent gene-panel analysis to detect the genetic cause. Panel-based analysis covered 59 genes involved in leukodystrophy, leukoencephalopathy and vanishing white matter disease (Supplementary table 1). The coding regions and exon-intron boundaries of the genes were enriched using a NimbleGen kit (NimbleGen, Roche, Basel, Switzerland). Sequencing was performed on an Illumina HiSeq 2000 (Illumina, San Diego, California, USA). Reads were aligned to the reference genome (hg19) using the Burrows-Wheeler Aligner (BWA) and annotated with SAMtools. Variants were selected for analysis based on the 1000 Genomes and dbSNP databases.
At a depth of at least 30X, 99.78% of the target region was covered. In addition, whole exome sequencing (WES) was performed with an average coverage depth of ≈100X. Sanger sequencing of the candidate variants was performed in the affected families.
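The coverage figure reported here is a breadth-of-coverage statistic: the share of targeted bases sequenced to a depth of at least 30X. The Python sketch below illustrates that calculation under the assumption that per-base depths are already available as a list of integers; the function name and the toy depths are hypothetical and are not taken from the study's pipeline.

```python
from typing import Sequence


def breadth_of_coverage(depths: Sequence[int], min_depth: int = 30) -> float:
    """Return the percentage of target bases covered at >= min_depth."""
    if not depths:
        raise ValueError("no target bases supplied")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


# Toy example (hypothetical depths, not study data).
toy_depths = [120, 98, 45, 30, 29, 77, 150, 31, 64, 88]
print(f"{breadth_of_coverage(toy_depths):.2f}% of target bases at >=30X")
```

In practice the depths would come from the aligned reads restricted to the enriched target regions, so the denominator is the size of the panel rather than the whole genome.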

Variant categories
The sequence data were compared with public databases and filtered to find candidate variants according to published pipelines [10]. The candidate variants were categorized as previously reported pathogenic variants or novel variants. ACMG guideline criteria were used to interpret and classify the novel variants [5].
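The filtering and categorization step can be pictured with a small Python sketch. It is only an illustration: the 1% allele-frequency cutoff, the `Variant` fields, and the placeholder genes are assumptions made for the example, since the text does not state the thresholds or data model of the published pipelines.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Variant:
    gene: str
    hgvs_c: str
    population_af: Optional[float]  # None = absent from the frequency database
    previously_reported: bool       # already described as pathogenic


def is_candidate(v: Variant, max_af: float = 0.01) -> bool:
    """Keep variants that are rare or absent in population databases.

    The 1% cutoff is an assumed value, not one given in the text.
    """
    return v.population_af is None or v.population_af <= max_af


def split_candidates(variants: List[Variant]) -> Tuple[List[Variant], List[Variant]]:
    """Split rare candidates into previously reported and novel variants."""
    reported, novel = [], []
    for v in variants:
        if not is_candidate(v):
            continue  # common variant, filtered out
        (reported if v.previously_reported else novel).append(v)
    return reported, novel


# Hypothetical placeholder variants, not study data.
variants = [
    Variant("GENE_A", "c.100A>G", 0.15, False),
    Variant("GENE_B", "c.200C>T", 0.0001, True),
    Variant("GENE_C", "c.300+1G>A", None, False),
]
reported, novel = split_candidates(variants)
print([v.gene for v in reported], [v.gene for v in novel])
```

Only the novel list would then go on to ACMG interpretation, which combines several lines of evidence and is not modeled here.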
Protein interaction
STRING, a database of protein-protein interactions, was used to examine the interactions among the studied proteins and their co-expression in humans and other vertebrates. Co-expression of the studied proteins in Homo sapiens was investigated with ProteomeHD to determine the function of the proteins in the cellular machinery [D1].
In total, 108 of 152 patients (71%) had defined MRI patterns (not available) and were clinically diagnosed with a known leukodystrophy. Lysosomal enzymes were measured for the diagnosis of MLD, KD, Tay-Sachs disease, and Sandhoff disease. Urinary sulfatides (e.g. for MLD) and plasma very long chain fatty acids (e.g. for X-ALD) were also tested to support the diagnosis. These patients were candidates for single gene analysis. 44 of 152 patients (29%) had no definite MRI pattern, and no definite biochemical or single gene test could be performed for them. They were candidates for gene-panel analysis or WES.
Demographic, clinical and genetic evaluation of patients confirmed genetically
114 (75%) patients were confirmed by genetic testing. Males accounted for 73 of 114 (64%) patients. The mean age of onset was 5 years 1 month ± 18 years 11 months.
Thirty-five different leukodystrophies and genetic leukoencephalopathies were identified in this study (Table-1). The clinical characteristics of the most common genetically confirmed disorders are summarized in Table-1 and Figure-1B. The main clinical manifestations were motor regression and neurological complaints including dystonia, hypotonia, developmental delay, ataxia, tremor, seizure, macrocephaly, nystagmus, and cognitive and learning impairment (Table-1 and Supplementary table-2).

Next generation sequencing: gene-panel and WES
Gene-panel analysis and WES identified leukodystrophies and leukoencephalopathies in 40 of 152 (26%) patients (Table-1, Supplementary Table-2). Four cases showed no variants on multigene panel analysis of leukodystrophies.
38 of 152 (25%) patients were not genetically confirmed. Some candidates for single gene analysis did not undergo panel-based analysis because the parents did not agree to the test.
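The counts reported above can be cross-checked with simple arithmetic. The Python sketch below uses only numbers stated in the text; the dictionary labels and the `pct` helper are illustrative names chosen for this example.

```python
COHORT = 152  # total number of tested children

counts = {
    "defined MRI pattern, single-gene candidates": 108,
    "no definite pattern, panel/WES candidates": 44,
    "genetically confirmed": 114,
    "confirmed by gene panel or WES": 40,
    "not genetically confirmed": 38,
}


def pct(n: int, total: int = COHORT) -> int:
    """Percentage of the cohort, rounded to the nearest whole number."""
    return round(100 * n / total)


for label, n in counts.items():
    print(f"{label}: {n}/{COHORT} = {pct(n)}%")

# The two ways of partitioning the cohort should each add up to 152.
assert 108 + 44 == COHORT
assert 114 + 38 == COHORT
```

Each rounded percentage agrees with the value quoted in the text (71%, 29%, 75%, 26% and 25%).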

Categorization of patients based on protein location
Lysosomal disorders
Forty-nine of 114 patients were diagnosed with lysosomal disorders (28 lysosomal LD and 21 lysosomal gLE). Forty-one patients were genetically confirmed to have MLD, TSD, SD or KD (Table-1).

Peroxisomal disorders
Eleven patients were diagnosed with peroxisomal disorders, eight of whom had X-ALD. One patient had a peroxisomal single-enzyme beta-oxidation deficiency, and two patients had peroxisomal biogenesis disorders (Table-1).

Errors of intermediary metabolism and other leukoencephalopathies
Forty patients were diagnosed with errors of intermediary metabolism or other leukoencephalopathies, including 12 with CD, 8 with PMLD and 7 with MLC (Table 1). CD, among the most common degenerative cerebral diseases, is caused by abnormal amino acid/organic acid metabolism and was the second most common disease in our population. PMD and PMLD are disorders of myelin genes. Four patients had vWM, 2 had hypomyelination-hypogonadotropic-hypogonadism-hypodontia, and 1 each had hypomyelination and congenital cataract, PMD, AD, infantile neuroaxonal dystrophy/atypical neuroaxonal dystrophy, hypomyelination leukodystrophy 9 (HLD9), Cockayne syndrome, and biotinidase deficiency (Table-1 and supplementary table).
Twelve mutations were each observed in at least two unrelated patients: six MLD patients had Gly311Ser and three had c.465+1G>A in the ARSA gene (data under publication); two X-ALD patients had c.1415_1416delAG in ABCD1; six CD patients had c.634+1G>T in ASPA; three patients (two homozygotes and one compound heterozygote) had c.237_238insA in ASPA; the novel GJC2 variants c.118G>C (p.Ala40Pro) and c.733T>A (p.Cys245Ser) were each observed in two PMLD patients; c.1528C>T in HEXA was observed in 4 of 11 TSD patients (2 of Turkish, 1 of Fars and 1 of Mazani ethnicity); c.509G>A in HEXA was found in two Gilak patients; two of six Sandhoff disease patients had c.833C>T; three MLC patients (two homozygotes and one heterozygote) had c.449_455delTCCTGCT; and two MLC patients had c.177+1G>T.
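The recurrent variants listed here use HGVS coding-DNA notation, in which an offset such as `+1` marks an intronic position next to an exon. As a rough illustration, the Python sketch below sorts such strings into broad variant types. It handles only a simplified subset of the notation and is not an HGVS validator; the regular expressions and category labels are assumptions made for this example.

```python
import re

# Simplified patterns for a subset of HGVS coding-DNA notation.
SUBSTITUTION = re.compile(r"^c\.(\d+)([+-]\d+)?([ACGT])>([ACGT])$")
DELETION = re.compile(r"^c\.(\d+)(?:_(\d+))?del([ACGT]*)$")
INSERTION = re.compile(r"^c\.(\d+)_(\d+)ins([ACGT]+)$")

CANONICAL_SPLICE_OFFSETS = {"+1", "+2", "-1", "-2"}


def classify(hgvs: str) -> str:
    """Return a broad variant type for a coding-DNA HGVS string."""
    hgvs = hgvs.replace(" ", "")  # tolerate "c. 634+1G>T" style spacing
    m = SUBSTITUTION.match(hgvs)
    if m:
        offset = m.group(2)
        if offset is None:
            return "coding substitution"
        if offset in CANONICAL_SPLICE_OFFSETS:
            return "canonical splice-site substitution"
        return "intronic substitution"
    if DELETION.match(hgvs):
        return "deletion"
    if INSERTION.match(hgvs):
        return "insertion"
    return "unrecognized"


for v in ["c.634+1G>T", "c.1415_1416delAG", "c.237_238insA", "c.1528C>T"]:
    print(v, "->", classify(v))
```

A classification like this only describes the form of the change; pathogenicity still has to be assessed separately.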
The variants were classified according to the ACMG guideline; 12 variants met the criteria for pathogenic, 18 were likely pathogenic, and 11 were variants of uncertain significance (VUS).
One X-ALD patient carried two variants as a haplotype, because his mother was heterozygous for both (case 5). These two variants were considered one haplotype, although each was classified as a likely pathogenic variant.

In silico analyses
Protein interaction analysis predicted that the studied proteins had physical and functional relationships except proteins encoded by PLA2G6, RARS, SLC17A5, L2HGDH and MLC1 genes (Figure-2, supplementary table-3).
Functional association was predicted among the different studied proteins. In the co-expression prediction analysis (data not shown), protein pairs such as the mitochondrial proteins POLG and SUCLA were expressed together.
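One way to picture this result is as a graph in which proteins are nodes and predicted associations are edges; the proteins without any association are then the isolated nodes. The Python sketch below shows that idea on a small subset. It does not query STRING; the helper name is hypothetical, and the edge list contains only a few pairs that the text reports as associated or co-expressed, not the full network of Figure-2.

```python
from typing import Iterable, Set, Tuple


def isolated_proteins(proteins: Iterable[str],
                      edges: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return the proteins that appear in no interaction pair."""
    connected: Set[str] = set()
    for a, b in edges:
        connected.update((a, b))
    return set(proteins) - connected


# A few pairs reported in the text as associated or co-expressed.
edges = [("POLG", "SUCLA"), ("ABCD1", "HSD17B4"), ("GJC2", "PLP1")]

# Illustrative protein list: the pairs above plus the five genes reported
# as having no predicted association.
proteins = ["POLG", "SUCLA", "ABCD1", "HSD17B4", "GJC2", "PLP1",
            "PLA2G6", "RARS", "SLC17A5", "L2HGDH", "MLC1"]

print(sorted(isolated_proteins(proteins, edges)))
```

An isolated node in such a graph means only that no association was predicted among the proteins included, not that the protein has no interaction partners at all.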

Discussion
The genetic diagnosis of childhood leukodystrophies has increased rapidly in recent years in Iran and worldwide; approximately 30 leukodystrophies and more than 60 disorders have been classified as genetic leukoencephalopathies [4]. This study provides a comprehensive spectrum of leukodystrophies and other genetic leukoencephalopathies in Iran among patients referred to a tertiary pediatric center. In total, 35 types of leukodystrophies were identified in the studied population. Based on the brain MRI pattern and single gene analysis, approximately 69% of the referred patients were confirmed by direct Sanger sequencing. Clinical diagnosis reduced the number of genes to be evaluated. Panel-based analysis also confirmed leukodystrophies in 26% of the cases. Our diagnostic rate for panel-based analysis was comparable to other studies [6]. Three patients remained genetically undiagnosed after panel-based studies, and WES/WGS is needed to define the causes. Consequently, 25% of cases remained genetically unsolved, and the diagnostic rate for leukodystrophies and genetic leukoencephalopathies in the study was 75%. The various novel variants identified show a high rate of allelic heterogeneity among our patients. The specific composition of the population living in Iran complicates this picture; different ethnicities with specific cultural customs call for more specific investigations in each population.
MLD was the most common cause of leukodystrophy in our population, followed by CD, Tay-Sachs disease, PMLD, X-ALD and then MLC. MLC was the most common disorder among Turk patients (6 of 23), while PMLD may be common in the Arab population in our study. Moreover, the ten most common diseases in this study comprise 70% of all diagnosed patients (80 of 114) (Table-1).
Clinically, we had unsolved cases due to variable phenotypic features or overlapping neurological manifestations; these were candidates for gene-panel and/or WES analysis. Nevertheless, some patients had no genetic diagnosis even after panel-based analysis. This could be due to intronic variants, copy number variations, unknown gene defects, or multigenic effects. Therefore, further genetic analysis should be performed for these cases.
Our in silico protein interaction and prediction analyses showed that protein interactions extend beyond a single cell type or physiological condition; proteins are highly specific and can interact without binding, e.g. transcription factors in the regulation of expression. Functional association predicts that two proteins contribute to a joint biological function [D1]. Five proteins, PLA2G6, MLC1, L2HGDH, RARS and SLC17A5, showed no association with the other proteins of leukodystrophies and leukoencephalopathies in our study. It is surprising that these proteins were not associated with leukodystrophy pathways and showed no functional interaction. All of them are related to cellular metabolism and function, yet none showed an interaction in the prediction analysis. Therefore, in the genetic analysis of rare diseases, WES may uncover further genes related to leukodystrophies in patients whose genetic cause remains unsolved.
Lysosomal diseases accounted for 43% of diagnosed patients in our study population, and these could be managed if diagnosed at an earlier age. Individuals with known causal variants benefit from anticipation of unexpected clinical presentations, from prognosis and palliative treatment, and from avoiding unnecessary treatments. Hematopoietic stem cell transplantation (HSCT) has been used for lysosomal storage diseases [5]. Some of our patients might have benefitted from HSCT at early stages of the disease. However, follow-up of patients for HSCT is beyond the scope of this study.
Some diseases have an ethnic-specific distribution, e.g. TSD in the Ashkenazi Jewish population, GM1 gangliosidosis in the Rudari isolate and MLD in the Western Navajo Nation [1]. Our MLD patients were from the western part of Iran (data under publication). Four of our TSD patients were from northern parts of Iran.
The peroxisomal disorders, a heterogeneous group, arise from defects in the function (e.g. X-ALD) or biogenesis (e.g. Zellweger spectrum) of peroxisomes. X-ALD, the most common peroxisomal disorder, is caused by mutations in the ABCD1 gene, which is co-expressed with the HSD17B4 gene (Figure-2). Patients with X-ALD could benefit from HSCT [7] or hematopoietic stem-cell gene therapy [17].
CD was the second most frequent disease in our study. It is the most common disease during infancy and has been observed mainly in Ashkenazi Jews, whereas the patients in our study were from various ethnicities. Various experimental therapies for Canavan patients are under investigation [3]. Patients with a known genetic etiology may benefit from such experimental therapies.
PMLD is responsible for 8% of hypomyelinating leukodystrophy cases [10]. In this study, 7% of the patients had the disease. In addition to GJC2, mutations in other genes such as the myelin-associated glycoprotein (MAG) gene have been reported to cause PMLD [Pt 9]. GJC2 is co-expressed with PLP1 and interacts with the products of the FAM126A, POLR3A and EIF2B5 genes (Figure-2). Our results suggest that PMLD may be more frequent than PMD in our population, especially in the Arab and Fars ethnicities.
Additionally, our prediction analysis showed that the MLC1 protein had no interaction or co-expression with the other studied proteins. Six of the MLC patients were of Turk ethnicity; the disorder may be common in, and limited to, specific ethnicities, e.g. those from Turkey.
11% of patients were diagnosed with mitochondrial genetic leukoencephalopathies; Leigh syndrome and L-2-HGA accounted for 4 and 3 of them, respectively. Leigh syndrome was due to variants in the SURF1, NDUFS1, NDUFS7 and SDHAF1 genes. Variants in L2HGDH, which encodes mitochondrial L-2-hydroxyglutarate dehydrogenase, may be common in our ethnicities. Its protein showed no interaction with the other proteins of our study, although it does interact with proteins outside this set. The mechanism of leukodystrophy is very complicated, and there may be proteins involved in disease progression that produce overlapping phenotypes but have no interaction, or no known interaction, with each other.

Analysis of founder effect and Hotspot mutations
A founder (ancestral) effect, or genetic signature within an ethnicity, usually leads to a high frequency and homozygosity of a mutation in that cohort; in contrast, a specific mutation that is distributed uniformly among many ethnicities is known as a mutational hotspot. Haplotype analysis is used to determine whether a mutation is a hotspot or a founder mutation. The studied mutations in ABCD1 (c.1415_1416delAG), ASPA (c.634+1G>T and c.237_238insA) and HEXA (c.1528C>T) show a wide distribution around the world [2,1,4,6]; in particular, c.634+1G>T in the ASPA gene was first reported from Turkey, and we found it in patients of Fars, Afghani, Lur and Arab ethnicities [1]. These mutations are considered hotspots, i.e. they occur in many populations. In contrast, the MLC1 mutations (c.177+1G>T and c.449_455delTCCTGCT) may have ancestors in the Turk population. In particular, the c.449_455delTCCTGCT variant was observed in three families; it may have originated from a founder ancestor in the Turk population, and it has previously been reported from Turkey [3].
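The distinction drawn here can be expressed as a first-pass heuristic on how many ethnic groups carry a mutation. The Python sketch below is a simplification and no substitute for haplotype analysis; the function name, the labels, and the threshold of three groups are assumptions made for this example. The ethnicity sets are taken from the text, and assigning all three MLC1 families to the Turk group is itself an assumption that the text only implies.

```python
from typing import Dict, Set, Tuple


def distribution_label(ethnicities: Set[str], hotspot_min: int = 3) -> str:
    """Label a mutation by the number of distinct ethnic groups carrying it.

    hotspot_min is an assumed threshold, not a value from the text.
    """
    if not ethnicities:
        raise ValueError("at least one ethnicity is required")
    if len(ethnicities) == 1:
        return "possible founder"
    if len(ethnicities) >= hotspot_min:
        return "possible hotspot"
    return "indeterminate"


observed: Dict[Tuple[str, str], Set[str]] = {
    ("HEXA", "c.1528C>T"): {"Turkish", "Fars", "Mazani"},
    ("ASPA", "c.634+1G>T"): {"Fars", "Afghani", "Lur", "Arab"},
    # Assumed to be Turk for all three families, as the text implies.
    ("MLC1", "c.449_455delTCCTGCT"): {"Turk"},
}

for (gene, variant), groups in observed.items():
    print(gene, variant, distribution_label(groups))
```

A label from this heuristic is only a prompt for follow-up: a shared haplotype around the mutation would support a founder origin, while different haplotypes would point to recurrent mutation.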

Challenges and limitations
Our registry does not include all affected patients; only patients referred to our center for genetic testing were counted in this study. In addition, Children's Hospital is a tertiary center in Tehran, and some patients elsewhere in the country may not have been registered or may have died before registration. Therefore, a multicenter registry is needed. The incidence of the disease in this part of the world may differ because of consanguineous marriages. Patients of Fars and Turk ethnicity were the most frequent; however, these ethnicities also make up a large share of the population of Iran.

Conclusion
In conclusion, this study adds to other studies on the distribution of the genetics of leukodystrophies in the Middle East. Genetic analysis provides diagnostic confirmation of the disease and allows physicians to establish the prognosis and manage patients and affected families. Genetic testing following counseling reduces the family's worry about the diagnosis and avoids further costs. The mortality rate in affected families is very high, which underscores the necessity of genetic testing in the country. Moreover, treatments for some diseases are successful at early stages, including the presymptomatic stage. Enzyme replacement, metabolic correction, and cell-based therapies given at the right time increase the patient's life span. This study provides information to support future therapeutic planning in the country.

Declarations
Ethics approval and patients' consent
Ethical approval was granted by the Growth and Development Research Center, Tehran University of Medical Sciences (ID number 98-02-80-43432). Informed consent was obtained from the patients.

Consent for publication
All contributing authors have read the manuscript and given their consent for the publication of this study.

Availability of supporting data
There are no additional unpublished data. MLD data is under publication.

Competing interests
None of the authors declared any conflict of interest.