Evaluation of somatic and/or germline mosaicism in congenital malformation of the eye

Microphthalmia, Anophthalmia and Coloboma (MAC) form a spectrum of congenital eye malformations responsible for severe visual impairment. Despite the exploration of hundreds of genes by High-Throughput Sequencing (HTS), most patients remain without a genetic diagnosis. One explanation could be the not yet demonstrated involvement of somatic mosaicism (undetected by conventional analysis pipelines) in those patients. Furthermore, the proportion of parental germline mosaicism in presumed de novo variations is still unknown in ocular malformations. Thus, using a dedicated bioinformatics pipeline designed to detect mosaic variants, we reanalysed the sequencing data obtained from a panel of 119 ocular development genes performed on blood samples of 78 probands with sporadic MAC without genetic diagnosis. Using the same HTS strategy, we sequenced 80 asymptomatic parents of 41 probands carrying a disease-causing variant in an ocular development gene considered de novo after Sanger sequencing of both parents. Reanalysis of the previously obtained sequencing data did not find any mosaic variant in probands without genetic diagnosis. However, HTS of parents revealed previously undetected SOX2 and PAX6 mosaic variants in two parents. This work, performed on two large cohorts of patients with MAC spectrum, provides for the first time an overview of the value of looking for mosaicism in ocular development disorders. Somatic mosaicism does not appear to be frequent in MAC spectrum and might explain only a few diagnoses. Thus, other approaches such as whole genome sequencing should be considered in those patients. Parental mosaicism is however not that rare (around 5%) and is challenging for genetic counselling.


INTRODUCTION
Microphthalmia, Anophthalmia and Coloboma (MAC) form a spectrum of related congenital eye malformations [1]. Microphthalmia refers to a reduced axial length of the eye of various severity (below two standard deviations of the age-adjusted population mean). Microphthalmia can be simple, when the eye is of reduced size, but anatomically intact, or complex, when associated with anterior segment dysgenesis (ASD) or defect in the posterior segment of the eye. Anophthalmia corresponds to the total absence of any tissue of the eye. Ocular coloboma refers to a segmental defect affecting all or parts of the iris, choroid, retina and optic nerve. These ocular defects can be unilateral or bilateral and are frequently responsible for severe visual impairment. Moreover, extraocular features, mainly neurodevelopmental disorders, are associated with the ocular anomaly in 33-95% of patients [2]. MAC are typically caused by absent or insufficient growth of the eye or by a failure of the optic fissure to close during early eye development [1]. The exact pathophysiology remains however poorly understood. Their aetiology can include environmental factors, but genetic alterations represent the major cause [1]. Despite the use of high-throughput DNA sequencing (HTS) approaches such as exome sequencing, 20-80% of patients remain without a genetic diagnosis after analysis [1,3,4]. Of note, the diagnosis rate is generally better when the ocular defect is severe, bilateral and syndromic.
Mosaicism is the consequence of a post-zygotic event resulting in the presence of a variant in a proportion of cells of the whole body [5]. This may result in clinical signs and/or risk of recurrence if the variant is present in the soma, the germline or both [6].
Detecting a mosaic variant with HTS requires sequencing at a greater depth of coverage (>100×) than required for the detection of constitutive variants (30-50×), and specific bioinformatics pipelines should be used. A disease-causing variant can therefore be missed when sequencing with standard procedures designed for constitutive variant detection. As mosaic variants are known to have an impact on human genetic diseases [6][7][8][9], we wanted to investigate whether an undetected mosaic pathogenic variant could explain some of the unsolved cases of patients with a diagnosis of MAC. We therefore reanalysed previously sequenced data obtained with a panel of 119 ocular genes performed in 78 probands with a diagnosis of MAC, using a dedicated pipeline designed for mosaic variant detection.
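The depth requirement can be illustrated with a simple binomial model. The sketch below is not part of the pipeline used in this study: it assumes that reads are sampled independently, ignores sequencing error, and uses a hypothetical calling threshold of at least 5 variant-supporting reads (`min_alt_reads`), chosen only for illustration.

```python
from math import comb


def prob_at_least(min_alt_reads: int, depth: int, vaf: float) -> float:
    """Return P(X >= min_alt_reads) for X ~ Binomial(depth, vaf).

    depth: number of reads covering the position.
    vaf:   true variant allele fraction (0.5 for a constitutive heterozygous variant).
    """
    p_below = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below


if __name__ == "__main__":
    # Compare a constitutive heterozygous variant with a low-level mosaic variant.
    for depth in (30, 50, 100, 700):
        for vaf in (0.5, 0.05):
            p = prob_at_least(5, depth, vaf)
            print(f"depth={depth:>4}x  VAF={vaf:.2f}  P(>=5 alt reads)={p:.3f}")
```

Under these assumptions, a constitutive heterozygous variant is almost always supported by enough reads at 30×, whereas a variant at 5% allele fraction is unlikely to reach the threshold at that depth and becomes progressively easier to observe as depth increases.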
On the other hand, parental germline (or gonosomal) mosaicism can lead to the recurrence of an apparently de novo disease-causing variant. This has already been occasionally described in ocular defects [10][11][12][13], as in many other genetic conditions, and should be taken into account in genetic counselling. Parental testing is usually performed on leucocyte DNA with Sanger sequencing, which usually cannot detect variants present at a level below 10%. We selected 41 probands with a diagnosis of ocular defect carrying a disease-causing variant that appeared de novo after parental testing by Sanger sequencing. We then sequenced the parental samples of these 41 probands using the same HTS strategy to check the pipeline's ability to detect low-level mosaicism for the offspring's variant missed by Sanger sequencing.

Probands with MAC and no genetic diagnosis
We selected 78 individuals (38 females and 40 males) with a diagnosis of unilateral (25 individuals) or bilateral (53 individuals) ocular defects belonging to the MAC spectrum. Among the 41 patients with a diagnosis of microphthalmia, six had simple microphthalmia, 16 had colobomatous microphthalmia and 19 had complex microphthalmia. Thirty-four patients had a diagnosis of simple coloboma and only one patient had a diagnosis of anophthalmia. Of note, extra-ocular features, mostly developmental delay, learning difficulties and growth retardation, were reported in one third of patients (25/78). None of them had a family history of ocular defects.
They had previously been explored by means of an HTS panel of 119 ocular development genes (Supplementary Data 1) without any disease-causing variant being identified. DNA was extracted from blood using the MagnaPure system (Roche Applied Science, Germany), except for three individuals whose DNA was extracted from other tissues (two from amniotic fluid, one from foetal muscle tissue). Capture probes were designed with SureDesign (Agilent, USA). A library of all coding exons and intron-exon boundaries was prepared using the SureSelect XT HS and XT Low Input Enzymatic Fragmentation Kit and the SureSelect XT HS Target Enrichment System for Illumina Paired-End Multiplexed Sequencing Library (Agilent, USA) following the manufacturer's instructions. Sequencing was then performed on a NextSeq500 platform (Illumina Inc., CA). Sequence alignment was performed with BWA 0.7.10, picard-tools-2.18.23 and elprep4 (indel realignment, base recalibration). Variant calling was then performed with GATK-3.3 (HaplotypeCaller) and Varscan2.3.7. Annotations were made with SNPEff-4.3 with additional information from gnomAD, ClinVar and dbSNP151.
Concerning the analysis of Copy Number Variations (CNV), screening was performed using a custom in-house pipeline previously described [14].
Prior to reanalysis, eight individuals were found to have a heterozygous pathogenic or likely pathogenic variant in a gene with an autosomal recessive inheritance of the ocular disease (Supplementary Data 2). These variants alone were therefore not sufficient to explain the patients' ocular phenotype.
For all probands, parental testing was performed on both parents using Sanger sequencing and did not find the offspring's disease-causing variant. Sanger sequencing was performed after targeted PCR amplification using specific primers on genomic DNA from the lymphocytes of the proband and parents. PCR products were then sequenced using an ABI3130XL or an ABI3500XL (Applied Biosystems, Foster City, California, U.S.A.). DNA sequence variants were identified using SeqScape™ Software v3.0 (ThermoFisher scientific, Waltham, Massachusetts, U.S.A.) and Sequencing Analysis Software V7.0 (ThermoFisher scientific).
In order to improve the detection of potential parental gonosomal mosaicism that could have been missed by Sanger sequencing, parents were sequenced by means of the targeted HTS panel (described above).
In two families, only one parent was resequenced: (i) In family LA090161, the proband had a heterozygous deletion of RAX, a gene associated with autosomal recessive disease, inherited from his mother, and a heterozygous RAX nonsense variant [c.665C>A p.(Ser222*)] found in neither parental sample. Only the father's DNA was resequenced as the mother transmitted the heterozygous whole gene deletion. (ii) In family SG040829, the PAX6 pathogenic variant [c.991C>T p.(Arg331*)] was found in a half-sister and half-brother with the same asymptomatic father. Thus, only the father was resequenced.
In total, 80 parents were sequenced using the 119-gene panel.
Mosaic detection using a targeted HTS panel: development of a new dedicated pipeline
Our laboratory had already explored the 78 probands using our 119-gene HTS panel, with no result allowing a diagnosis. We therefore directly analysed the raw data (FASTQ files) previously obtained in our laboratory. For the 80 parents, parental testing used the same sample as for Sanger sequencing (DNA extracted from blood), and we applied the same HTS procedure (described above) to generate the raw sequencing data. We reanalysed raw HTS data (FASTQ files) from all patients using a pipeline dedicated to mosaic variant detection. The overall pipeline performs trimming, sequence alignment (GRCh37) and duplicate suppression with CutAdapt [15], BWA-mem [16] and MarkDuplicates (http://broadinstitute.github.io/picard/), respectively. The variant calling step used three different variant callers in parallel: Mutect2 [17], VarDict [18] and FreeBayes [19]. The pipeline was able to detect a variant with an allelic frequency as low as 4% at 100X depth in a previous test performed in our laboratory with known mosaic variants (data not shown). The standardisation of VCF files follows the VCF 4.3 specifications, which allows secondary fusion of co-occurring variants, haplotype normalisation and the production of a single variant calling file (VCF) per sample. Annotation was performed using the VEP tool (Ensembl). We analysed annotated VCFs without filters on allelic frequency. Of note, we analysed the parents' annotated VCF with a focus on their offspring's pathogenic variant to avoid unwanted secondary findings. To make sure that our pipeline did not miss any parental mosaic variant, the parents' binary alignment map (BAM) files were visualised at the position of the offspring's variant using Integrative Genomics Viewer (IGV).
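The way calls from several variant callers can be combined into one record per variant can be sketched as follows. This is a minimal illustration and not the implementation used in this study: the function name `merge_calls`, the tuple representation of a variant and the example coordinates are all hypothetical, and the sketch assumes that variants have already been normalised so that identical events share the same key.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# A normalised variant is represented here as (chromosome, position, ref, alt).
Variant = Tuple[str, int, str, str]


def merge_calls(calls_by_caller: Dict[str, Iterable[Variant]]) -> Dict[Variant, Set[str]]:
    """Build the union of all calls, recording which callers support each variant."""
    merged: Dict[Variant, Set[str]] = defaultdict(set)
    for caller, variants in calls_by_caller.items():
        for variant in variants:
            merged[variant].add(caller)
    return dict(merged)


if __name__ == "__main__":
    # Hypothetical calls for one sample (coordinates are made up).
    calls = {
        "Mutect2": [("chrA", 1000, "G", "C")],
        "VarDict": [("chrA", 1000, "G", "C"), ("chrB", 2500, "T", "A")],
        "FreeBayes": [("chrA", 1000, "G", "C")],
    }
    for variant, callers in sorted(merge_calls(calls).items()):
        print(variant, sorted(callers))
```

Keeping the union of calls rather than the intersection favours sensitivity, which matches the choice described above of analysing the annotated VCFs without filters on allelic frequency; the number of supporting callers can then be used as one indicator when reviewing a candidate.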

Probands with MAC and no genetic diagnosis
After reanalysis of previously sequenced data from the targeted HTS panel of 119 ocular development genes, we did not find any mosaic variant in the 78 individuals with MAC. All previously detected heterozygous variants of interest were found again (Supplementary Data 2). The quality metrics of the run were sufficient to allow mosaic variant screening, with a mean depth of 716×, 99.65% coverage at 30× and 98.30% coverage at 100×.
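Metrics of this kind (mean depth and the percentage of positions covered at a given threshold) can be computed from per-base depths as in the sketch below. It is an illustration only: the function name `coverage_summary` and the input format (a plain list of depths, one per targeted base) are assumptions, not a description of the tools used in this study.

```python
from typing import Dict, Sequence


def coverage_summary(depths: Sequence[int], thresholds: Sequence[int] = (30, 100)) -> Dict[str, float]:
    """Return the mean depth and the percentage of positions at or above each threshold."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    n = len(depths)
    summary = {"mean_depth": sum(depths) / n}
    for t in thresholds:
        covered = sum(1 for d in depths if d >= t)
        summary[f"pct_ge_{t}x"] = 100.0 * covered / n
    return summary


if __name__ == "__main__":
    # Toy example with four positions; real input would have one value per targeted base.
    print(coverage_summary([10, 50, 150, 800]))
```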

Parental test of apparently de novo pathogenic variants
Resequencing of the 80 parents revealed two parental gonosomal mosaics (Fig. 1). The PAX6 c.52G > C pathogenic variant associated with bilateral complex microphthalmia in the proband (SG110088) was found in around 10% of cells in the blood DNA of her asymptomatic mother (allelic frequency: 13/252 = 5%) (Fig. 1A and B). This mosaic variant led to a recurrence identified on prenatal diagnosis. In another family, the SOX2 c.70_89del pathogenic variant associated with right anophthalmia and left Peters' Anomaly in the proband (ADN200107) was found in around 4% of cells in the blood DNA of her asymptomatic mother (allelic frequency: 9/441 = 2%) (Fig. 1C). Sanger sequencing of PAX6 or SOX2 performed previously using the same DNA samples failed to detect these two variants (Fig. 1B, E). However, reanalysis of previously obtained sequences (.ab1 files) using Minor Variant Finder Software (Applied Biosystems) successfully found the 10% mosaic PAX6 variant but failed to detect the 5% mosaic SOX2 variant. Testing of the other parents did not reveal any other mosaic variant, despite high depth at each variant base in each family (mean depth: 1213X, minimal depth: 278X). Of note, in one family, germline mosaicism was highly suspected because of the recurrence in a girl (SG040829) and her paternal half-brother of the nonsense c.991C > T PAX6 variant associated with bilateral aniridia in both (Fig. 1D). HTS resequencing of the father did not detect the variant found in his offspring despite good coverage (depth of 1965X at variant base), suggesting mosaicism confined to the germline lineage. In this family, sample identities were confirmed with the PowerPlex® 16 HS System (Promega, Madison, Wisconsin, U.S.A.).
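The relation between the allelic frequencies and the cell percentages given above can be made explicit with a short calculation. The sketch below assumes a heterozygous variant at a diploid autosomal locus without copy number change, so that each variant-carrying cell contributes one variant allele out of two; the function names are illustrative.

```python
def allele_fraction(alt_reads: int, depth: int) -> float:
    """Variant allele fraction: variant-supporting reads divided by total depth."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth


def cell_fraction_heterozygous(vaf: float) -> float:
    """Estimated fraction of cells carrying a heterozygous variant (assumption: diploid locus)."""
    return min(1.0, 2.0 * vaf)


if __name__ == "__main__":
    # Read counts reported for the two mosaic mothers.
    for label, alt, depth in (("PAX6 c.52G>C", 13, 252), ("SOX2 c.70_89del", 9, 441)):
        vaf = allele_fraction(alt, depth)
        print(f"{label}: VAF={vaf:.1%}, estimated cells={cell_fraction_heterozygous(vaf):.1%}")
```

With the reported read counts, this gives estimates close to the approximately 10% and 4% of cells stated for the PAX6 and SOX2 variants, respectively.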

DISCUSSION
In a unique cohort of 78 patients with a MAC diagnosis, we did not find any additional genetic diagnosis after reanalysis with a pipeline designed for mosaic variant detection. The use of targeted HTS provided sufficient depth for mosaic variant screening (98.30% at 100X, mean depth 716X). Despite this high coverage, we cannot exclude having missed a variant with a very low allelic frequency (VAF < 0.1). The involvement of such a variant in the patient's phenotype is however difficult to ascertain [20]. Furthermore, tissue-specific mosaicism, with a variant absent from the patient's blood but present in other tissues (such as ocular tissue), could also have been missed as we used only leucocyte DNA. This phenomenon has previously been described in structural variations [21] and some syndromes such as Cornelia de Lange syndrome [22,23]. Furthermore, Daich Varela et al. [24] described a suspected gonosomal SOX2 variant mosaicism absent from blood in a mother with asymptomatic uveal coloboma. Her son had a SOX2 variant associated with microphthalmia and developmental delay. Parental testing using exome sequencing did not find the offspring's SOX2 variant in the mother's blood or in her saliva. In a large cohort of probands with severe developmental disorders (DDD Study), there was however no significant variation in the levels of mosaicism between saliva and blood [20]. Although tissue-specific mutation is conceivable in MAC disorders, we did not have access to other tissues, in particular ocular ones.
In a recent review, Ohuchi et al. [25] underlined the potential role of mosaicism in congenital eye anomalies, partly explaining phenotypic variability. Several cases of parental mosaicism have in fact been reported with phenotypes of various severity in parents, ranging from asymptomatic to severe, without any clear correlation with the mosaic rate [11,12,26]. For example, Ragge et al. [27] described the segregation of an OTX2 disease-causing variant in two children with bilateral complex microphthalmia and a less severe phenotype in their mother, who had the variant at mosaic state (around 20% in blood) and displayed pigmentary retinopathy without ocular malformation. Thus, somatic mosaicism could result in a less typical or milder clinical presentation, so that no exploration is done before the occurrence of a more severe phenotype in offspring. In the same way, somatic mosaicism might partly explain unilateral defects, which are frequent in congenital eye defects but have no pathophysiological explanation to date. In the literature, all mosaic Single Nucleotide Variations (SNVs) reported to date in MAC phenotypes are parental mosaics and were found after the identification of the heterozygous variant in offspring. However, mosaic CNVs in PAX6 have previously been described in a cohort of patients with aniridia [28]. Of note, in this last study, patients with a mosaic variant (30-62% of cells) did not show significant clinical differences from patients harbouring a constitutive variant.
Fig. 1 [...] Viewer (IGV) of the mosaic variant in asymptomatic mother's blood. B Sequence visualisation of the PAX6 variant using Sequencing Analysis (up) and Minor Variant Finder software (MVF; down). Note that the mother's mosaic variant (that corresponds to the background noise level with Sequencing Analysis) is detected by MVF. C Pedigree of ADN200107 with a pathogenic SOX2 c.70_89del variant with BAM files visualisation on IGV of the mosaic variant in asymptomatic mother's blood (up) and her affected daughter (down). D Pedigree of SG040829 with a pathogenic PAX6 c.991C > T variant found in both children with aniridia but not found in their asymptomatic father as shown in BAM files visualisation on IGV (up). E Sequence visualisation of the SOX2 variant of ADN200107 using Sequencing Analysis showing absence of distinguishable variant in mother's sample.
Thus, somatic mosaicism does not appear to be a frequent mechanism of variant occurrence in MAC. Systematic screening for mosaic variants in the same sample after negative molecular screening does not seem relevant, and other techniques or samples should be preferred. In order to increase the diagnosis rate in those patients, whole genome sequencing would allow analysis of previously unexplored genomic regions (other genes, introns, intergenic regions). Furthermore, long-read sequencing could also help to reach more diagnoses, with better genome coverage than short-read sequencing and improved detection of structural variations [29].
On the other hand, parental gonosomal mosaicism does not seem to be that rare in ocular defects [25]. In our cohort of 80 parents of 41 probands with a disease-causing variant, presumed de novo after parental Sanger sequencing, we found two variants at low mosaic state in asymptomatic parents. This means that, in at least 5% (2/41) of these families, an asymptomatic parent could be mosaic. This percentage is probably underestimated, as illustrated by the probable mosaicism associated with a recurrence that was not detected in the paternal sample (SG040829). These three parental mosaics (two confirmed, one highly suspected) represent 7% (3/41) of our cohort. Demonstration of parental mosaicism is important for genetic counselling as it implies a higher recurrence risk, as shown in family SG110088. Prenatal diagnosis should therefore be discussed with the couples, even in case of negative parental testing. The use of HTS confirms the limits of Sanger sequencing for detecting low-level mosaic variants. Performing exome and whole genome sequencing usually requires trio sequencing of the proband and both parents, which should in theory overcome the limitations of Sanger sequencing. HTS however still has some limitations to be taken into consideration, such as non-detection of low-level parental mosaicism by standard pipelines or insufficient depth [20]. Furthermore, by analysing only peripheral blood DNA, we could have underestimated the risk of recurrence, missing a germline mosaicism absent from blood. This is the case in one family (SG040829) with a highly suspected paternal germline mosaicism of a PAX6 disease-causing variant not detected in blood but confirmed by the recurrence in two of his children (with different mothers). Although it is the most commonly used sample in routine practice, blood might not be the best-suited sample to detect parental mosaicism in ocular development diseases. In a family with an Alport syndrome-affected child, Dai et al. [30] showed paternal germline mosaicism not found in the blood sample, but detected in 2.65% of cells in the father's sperm sample. Thus, it would be interesting to look for parental mosaicism in other tissues (saliva, urine, sperm samples) to see if another easily accessible tissue is more appropriate for the detection of mosaics. However, although screening of multiple tissues would help to refine the risk of recurrence, it would never allow the exclusion of germline mosaicism and recurrence risk in clinical practice.
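Because these proportions rest on 41 families, their statistical uncertainty is worth keeping in mind. The sketch below computes a Wilson score interval for a proportion; this interval is our choice for illustration and is not an analysis reported in this study, and the function name `wilson_interval` is hypothetical.

```python
from math import sqrt
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95% confidence)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Confirmed mosaics (2/41) and confirmed plus highly suspected mosaics (3/41).
    for k in (2, 3):
        low, high = wilson_interval(k, 41)
        print(f"{k}/41 = {k / 41:.1%}  (Wilson interval: {low:.1%} to {high:.1%})")
```

With so few events the interval is wide, which is consistent with presenting the 5% figure as an order of magnitude and a lower bound rather than a precise estimate.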
In conclusion, mosaic variant detection represents a challenge for both diagnosis and genetic counselling, especially in MAC disorders. Firstly, considering that the clinically relevant tissue is absent or inaccessible in those disorders, this study represents a unique work showing that mosaic variants in blood are rare in patients with MAC. Secondly, we show here that the frequency of parental mosaicism (at least 5% and probably underestimated) is higher than the theoretical recurrence risk of around 1% usually given to parents in consultation.

DATA AVAILABILITY
All data generated or analysed during this study are included in this published article and its supplementary information files. Newly described variants have been added to the ClinVar database (SCV002583597-SCV002583623).