Loss of function of ADNP by an intragenic inversion

ADNP is a well-known gene implicated in intellectual disability, and its molecular spectrum consists mainly of loss-of-function variants in the last and largest exon of ADNP. Here, we report the first description of a patient with intellectual disability carrying an intragenic inversion in ADNP. RNA sequencing (RNAseq) showed skipping of the inverted exons during splicing. Moreover, in silico analysis of initiating ATGs in the mutated transcript using a contextual Kozak score suggested that several initiating ATGs were likely used to translate poisonous out-of-frame ORFs, which would suppress any in-frame rescuing translation and thereby cause haploinsufficiency. As constitutive Alu sequences with high homology were identified at both breakpoints in reversed orientation in the reference genome, we hypothesized that Alu-mediated non-allelic homologous recombination was responsible for this rearrangement. As this inversion is not detectable by exome sequencing, this mechanism could represent an underdiagnosed recurrent mutation in ADNP-related disorders.


INTRODUCTION
Trio-based exome sequencing of patients with intellectual disability (ID) identifies a genetic cause in 16-45% of cases [1]. Part of the missing diagnoses is expected to be explained by as yet unknown genes and by genetic abnormalities that cannot be detected by DNA microarray or exome sequencing (ES), such as translocations, large inversions, and intronic variants [2]. Genome sequencing (GS) is anticipated to achieve a higher diagnostic yield than ES owing to its wider and more homogeneous coverage [2].
ADNP plays a role in embryonic development, especially during neural tube closure, and is involved in chromatin remodeling [3,4]. It stabilizes β-catenin by binding to its armadillo domain and enhances Wnt/β-catenin transcriptional activation [5]. The C-terminal part of ADNP directly interacts with ARID1A, SMARCA4, and SMARCC2, three essential components of the BAF complex, which is involved in the regulation of gene expression [3].
Monoallelic alterations of ADNP are a well-known cause of ID [6]. The main clinical features include hypotonia, severe speech and motor delay, mild-to-severe intellectual disability, and characteristic facial features (prominent forehead, high anterior hairline…). Most known variants are frameshift or nonsense variants located in the last coding exon, which is thought to escape nonsense-mediated decay; therefore, Helsmoortel et al. suspected a dominant-negative effect [6].
Recently, a few stop-gained variants or deletions affecting the two other coding exons and one whole gene deletion have been described [7][8][9], promoting haploinsufficiency as one of the pathological mechanisms.
Here, we present the case of a young patient with ID in whom previous exploration by ES and SNP array failed to identify a genetic abnormality. GS revealed a large intragenic inversion, the first described to date in the ADNP gene.

PATIENT DESCRIPTION AND RESULTS
The proband was a 3-year-old female with ID. Pregnancy and birth measurements were normal. At initial evaluation, weight, height and head circumference were 18 kg (+2 SD), 91 cm (−1 SD) and 50.5 cm (within normal range), respectively. Her global developmental delay, i.e., affecting all developmental areas, was dominated by language impairment, with speech onset at 21 months (only "mom" and "dad"). She was able to walk at 15 months. A prominent forehead, recurrent infections and visual impairment with astigmatism and hypermetropia were noted. Recent constipation was reported, as well as behavioral and social interaction difficulties. Exploration by SNP array, ES and testing for the fragile X syndrome-associated CGG expansion, performed in 2020, were negative. GS, bioinformatics and interpretation were carried out as described previously [2]. Trio analysis of the proband and her healthy parents showed a de novo intragenic inversion in the ADNP (Activity Dependent Neuroprotector Homeobox) gene in the heterozygous state: Chr20(GRCh37):g.49515761_49525309inv NM_001282531.3:c.-89-3923_201+2793inv (Fig. 1a). This structural variant encompassed exons 3 to 5 (according to the MANE transcript, ADNP has six exons compared to five in many alternative transcripts, i.e., exon 5 is the penultimate exon), involving the first two coding exons, including the initiator Met1. The inversion and its de novo occurrence were confirmed by Sanger sequencing (Supplementary Data, Figs. S1 and S2).
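To make the genomic notation concrete, the following minimal Python sketch shows what an inversion described by 1-based inclusive coordinates does to a sequence, and computes the size of the reported interval (49525309 − 49515761 + 1 = 9,549 bp). The function names and the toy sequence are illustrative assumptions and are not part of the analysis pipeline used in this study.

```python
# Illustrative sketch only: toy sequence, hypothetical helper names.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def apply_inversion(reference: str, start: int, end: int) -> str:
    """Invert reference[start..end] (1-based, inclusive coordinates).

    The inverted segment is replaced by its reverse complement, which is
    what a 'g.start_endinv' description denotes.
    """
    if not 1 <= start <= end <= len(reference):
        raise ValueError("coordinates outside the reference sequence")
    left = reference[: start - 1]
    segment = reference[start - 1 : end]
    right = reference[end:]
    return left + reverse_complement(segment) + right


def inversion_length(start: int, end: int) -> int:
    """Number of bases covered by an inclusive interval."""
    return end - start + 1


# Genomic coordinates reported in the text (GRCh37, chromosome 20).
INV_START, INV_END = 49515761, 49525309

if __name__ == "__main__":
    print(inversion_length(INV_START, INV_END))
    print(apply_inversion("AAACGTTT", 3, 5))  # toy example
```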
We first considered the hypothesis that the rearrangement could disrupt regulatory regions, leading to an absence of transcription. To assess the consequence on gene expression, the impact on transcriptional regulation was evaluated using the UCSC Genome Browser [10] tracks ENCODE Regulation and Candidate Cis-Regulatory Elements (cCREs), to check whether the inverted region contains or disrupts regulatory regions. Most of the regulation-associated marks (histone acetylation, CpG islands) are located outside the inversion (Supplementary Data, Fig. S3), so no consequence on transcriptional regulation is predicted.
To investigate the possible creation of a 'reversed' poison pseudo-exon, we performed RNAseq on a peripheral blood sample from the patient. mRNA library preparation and sequencing were performed as described in the supplemental data.
First, the allelic balance could not be used to distinguish bi- from mono-allelic expression, as no coding variant was found in the heterozygous state, likely due to consanguinity (Supplementary Data, Fig. S4). Second, no aberrant exon was seen, excluding any 'reversed' splicing aberration. Nevertheless, seven aberrant exon1-6 and one exon2-6 splice junctions were identified in the patient. Junction 2-6 was never seen among the controls, and a single junction 1-6 was observed among our 38 controls. These results suggest skipping of the inverted exons by the splicing machinery (Fig. 1b, controls not shown). In addition, comparison of blood expression of ADNP with the 38 controls showed no decrease in expression (Fig. 2), further supporting a pathological mechanism that does not involve diminished transcription of ADNP.
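The reasoning behind the junction analysis can be sketched as follows: a junction joining exon i directly to exon j skips exons i+1 to j−1, and it is of interest here when the skipped set contains all inverted exons (3 to 5). The sketch below is a minimal illustration under that assumption; the function names are hypothetical, and only the junction counts stated in the text are used.

```python
# Illustrative sketch only: hypothetical helper names, counts taken from the text.
INVERTED_EXONS = frozenset({3, 4, 5})  # exons encompassed by the inversion


def skipped_exons(donor_exon: int, acceptor_exon: int) -> set:
    """Exons lying strictly between the two exons joined by a junction."""
    return set(range(donor_exon + 1, acceptor_exon))


def skips_inversion(donor_exon: int, acceptor_exon: int) -> bool:
    """True if the junction bypasses every inverted exon."""
    return INVERTED_EXONS <= skipped_exons(donor_exon, acceptor_exon)


def count_skipping_junctions(junction_counts: dict) -> int:
    """Sum the counts of junctions that bypass the inverted exons."""
    return sum(
        count
        for (donor, acceptor), count in junction_counts.items()
        if skips_inversion(donor, acceptor)
    )


# (donor exon, acceptor exon) -> number of junctions observed.
PATIENT = {(1, 6): 7, (2, 6): 1}
CONTROLS = {(1, 6): 1, (2, 6): 0}  # pooled over the 38 controls

if __name__ == "__main__":
    print(count_skipping_junctions(PATIENT))
    print(count_skipping_junctions(CONTROLS))
```

A junction such as exon 5 to exon 6 skips nothing and is therefore ignored by this filter, whereas both aberrant junction types reported in the patient are retained.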
Therefore, we investigated the translation possibilities of such a mutated transcript, i.e., one lacking the physiological initiating ATG. Potential downstream initiating ATGs were searched for using TIS Predictor, which predicts translation initiation sites in a given nucleotide sequence and provides a Kozak-based strength [11]. To define an ATG as a potential initiation site, we chose a cutoff of Kozak score >0.51, corresponding to the Kozak score of the physiological starting ATG, Met1, instead of the cutoff of >0.64 proposed by the authors. We identified six out-of-frame ATGs with a significant Kozak score (Fig. 1d) upstream of the first in-frame ATG (Met229), which had a non-significant Kozak strength. Further downstream, we identified two additional significant out-of-frame ATGs before the next significant in-frame ATG (Met254). These out-of-frame ATG codons were predicted to initiate the translation of small open reading frames, likely diminishing the translation efficiency of the transcript.
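The logic of this scan can be illustrated with a minimal Python sketch. It is not TIS Predictor and does not compute Kozak scores: it only lists ATG codons, labels each as in-frame or out-of-frame relative to a chosen reference frame, measures the open reading frame each would initiate, and counts the out-of-frame ATGs exceeding a score cutoff that precede the first in-frame ATG exceeding it. The scores are assumed to be supplied by an external predictor; the toy sequence, the toy scores and all function names are illustrative assumptions.

```python
# Illustrative sketch only: not TIS Predictor; scores must come from an external tool.
from typing import Dict, List, NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


class StartSite(NamedTuple):
    position: int    # 0-based index of the A of the ATG
    in_frame: bool   # same frame as the reference coding frame
    orf_codons: int  # codons from the ATG up to (not including) the first stop


def scan_start_codons(transcript: str, reference_frame: int) -> List[StartSite]:
    """List every ATG, 5' to 3', with its frame and the ORF it would open."""
    seq = transcript.upper()
    sites = []
    pos = seq.find("ATG")
    while pos != -1:
        in_frame = (pos - reference_frame) % 3 == 0
        codons = 0
        for i in range(pos, len(seq) - 2, 3):
            if seq[i : i + 3] in STOP_CODONS:
                break
            codons += 1
        sites.append(StartSite(pos, in_frame, codons))
        pos = seq.find("ATG", pos + 1)
    return sites


def strong_out_of_frame_before_in_frame(
    sites: List[StartSite], scores: Dict[int, float], cutoff: float = 0.51
) -> int:
    """Count out-of-frame ATGs with score > cutoff located 5' of the first
    in-frame ATG whose score also exceeds the cutoff."""
    count = 0
    for site in sites:
        strong = scores[site.position] > cutoff
        if site.in_frame and strong:
            break
        if not site.in_frame and strong:
            count += 1
    return count


if __name__ == "__main__":
    toy = "CCATGGATGAAATAGCC"        # toy sequence, not the ADNP transcript
    toy_sites = scan_start_codons(toy, reference_frame=0)
    toy_scores = {2: 0.70, 6: 0.60}  # placeholder scores, not real predictions
    print(toy_sites)
    print(strong_out_of_frame_before_in_frame(toy_sites, toy_scores))
```

The default cutoff of 0.51 mirrors the threshold chosen in the text; passing `cutoff=0.64` would apply the threshold proposed by the authors of the tool.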
Finally, the inverted sequence Chr20(GRCh37):g.49515761_49525309 and its flanking regions were analyzed using the online tool RepeatMasker [12] to search for repeat elements that could mediate genomic rearrangements and to determine the mechanism involved. We detected Alu sequences in the reference genome (hg19) at both breakpoints, AluSq10 and AluY (Fig. 1c). Their comparison by blastn analysis [13] showed high homology (79% identity, E-value 4e-72) and a reversed orientation favorable for pairing.
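The search for a candidate repeat pair can be sketched as a simple filter over repeat annotations: keep repeats overlapping each breakpoint, then retain pairs annotated on opposite strands, the configuration compatible with an inversion by recombination between inverted repeats. The sketch below is an assumption-laden illustration: the record layout, function names, coordinates and strands are toy values and do not reproduce the RepeatMasker output or the actual positions of the Alu elements.

```python
# Illustrative sketch only: toy coordinates and strands, hypothetical record layout.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Repeat:
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def repeats_at(repeats: List[Repeat], breakpoint: int, window: int = 0) -> List[Repeat]:
    """Repeats overlapping a breakpoint, optionally widened by a window (bp)."""
    return [r for r in repeats if r.start - window <= breakpoint <= r.end + window]


def inverted_pairs(
    repeats: List[Repeat], breakpoint_1: int, breakpoint_2: int, window: int = 0
) -> List[Tuple[Repeat, Repeat]]:
    """Pairs (repeat at breakpoint 1, repeat at breakpoint 2) on opposite strands."""
    return [
        (a, b)
        for a in repeats_at(repeats, breakpoint_1, window)
        for b in repeats_at(repeats, breakpoint_2, window)
        if a.strand != b.strand
    ]


if __name__ == "__main__":
    toy_repeats = [
        Repeat("repeat_left", 900, 1200, "+"),
        Repeat("repeat_right", 4900, 5200, "-"),
        Repeat("repeat_same_strand", 4950, 5100, "+"),
    ]
    for left, right in inverted_pairs(toy_repeats, 1000, 5000):
        print(left.name, right.name)
```

Sequence similarity between the members of a retained pair would still need to be assessed separately, as was done here with blastn.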

DISCUSSION
Here, we report the first intragenic inversion of ADNP, identified by GS in a patient for whom previous explorations (ES, SNP array, fragile X testing) were negative. Our patient's phenotype was consistent with the known ADNP-related phenotypic spectrum, including some recurrent features of Helsmoortel-van der Aa syndrome: language impairment, prominent forehead, hypermetropia, and recurrent infections [6]. To date, no inversion of the ADNP gene has been reported in the ClinVar [14], Decipher [15], HGMD [16], and gnomAD [17] databases. Interestingly, the rearrangement in our patient did not directly affect the last exon, where most of the pathogenic variants are located [6,7].
Inversions are a type of structural variant that is difficult to analyze owing to their balanced nature and the location of their breakpoints, often within complex repeated regions. Different mechanisms have been described, such as non-allelic homologous recombination (NAHR) between inverted repeats, double-strand break repair mechanisms (non-homologous end joining), or replication-based mechanisms mediated by microhomology (fork stalling and template switching) [18].
The identification of Alu sequences at both breakpoints with high homology and reversed orientation suggested that this structural variant most likely involved NAHR, arising via a recombination event mediated by those inverted Alu sequences. As NAHR often leads to recurrent rearrangements, the presence of an inversion-prone Alu sequence pair in the reference genome could result in a recurrent cause of ADNP-related disorder that remains underdiagnosed because such variants are challenging to detect. This hypothesis reinforces the relevance of exploring the pathological mechanism of this particular inversion.
First, RNAseq did not show monoallelic or reduced expression of ADNP, suggesting a mechanism independent of mRNA reduction. Furthermore, the analysis of the regulatory hallmarks of ADNP did not support a transcriptional dysregulation. Moreover, RNAseq did not identify any aberrant 'reversed' pseudo-exon but highlighted aberrant RNA junctions, seven exon 1 to 6 and one exon 2 to 6, suggesting skipping of the inverted exons.
Investigating the possible start codons of this mutated transcript, we found six putative out-of-frame ATG codons with a significant Kozak strength upstream of the first in-frame ATG (Met229, non-significant Kozak strength) and two others before the first significant in-frame ATG (Met254). Met229 is predicted to initiate the translation of a short minor ADNP isoform (UniProt A0A2R8Y6X0, 874 aa) from the minor transcript ENST00000645081.1, and Met254 may lead to an even shorter protein of 849 aa. Both may have a dominant-negative effect. Nevertheless, as the number and initiating strength of such upstream small ORFs have been shown to significantly reduce translation efficiency [19,20], we hypothesized that this inversion leads to haploinsufficiency through failure to initiate in-frame translation.
In conclusion, we report the identification and characterization of a novel and possibly recurrent structural variant in ADNP. Such balanced anomalies arising in deep intronic regions could explain part of the diagnoses missed by ES and SNP array. Our results suggest that this inversion causes skipping of the inverted region and failure of any in-frame rescuing translation, because out-of-frame ATGs likely create small ORFs that reduce the translation efficiency of the transcript.

DATA AVAILABILITY
The datasets generated during the current study are available from the corresponding author on reasonable request. This structural variant, NM_001282531.