Autistic traits in myotonic dystrophy type 1 due to MBNL inhibition and RNA mis-splicing

Tandem repeat expansions are enriched in autism spectrum disorder, including the CTG expansion in the DMPK gene that underlies myotonic muscular dystrophy type 1. Although the clinical connection of autism to myotonic dystrophy is corroborated, the molecular links have remained unknown. Here, we show a mechanistic path to autism via repeat expansion in myotonic dystrophy. We found that inhibition of muscleblind-like (MBNL) splicing factors by expanded CUG RNAs alters the splicing of autism-risk genes during brain development, especially a class of autism-relevant microexons. To provide in vivo evidence that the CTG expansion and MBNL inhibition axis leads to the presentation of autistic traits, we demonstrate that CTG expansion and MBNL-null mouse models recapitulate autism-relevant mis-splicing profiles and exhibit social deficits. Our findings indicate that DMPK CTG expansion-associated autism arises from developmental mis-splicing. Understanding this pathomechanistic connection provides an opportunity for more in-depth investigation of mechanistic threads in autism.

assessed social interaction deficits in specific DM1 mouse models: a Dmpk 3'UTR CTG exp knock-in (KI) model as well as Mbnl knock-out (KO) mouse models. Our results provide insights into the molecular mechanism underlying DM1-associated ASD, in which developmental mis-splicing of ASD-linked genes arises from loss of MBNL activity due to CUG repeat expansions.

ASD-risk gene mis-splicing in human DM1 prefrontal cortex
The prefrontal cortex orchestrates executive functions affected in ASD, and previous studies reported transcriptome-wide changes in this brain region 29,37. To test the hypothesis that the DMPK 3'UTR CTG exp mutation leads to mis-splicing of ASD-risk genes, we analyzed human prefrontal cortex (Brodmann area 10; BA10) RNA-seq data generated from DM1 (unknown ASD status) and unaffected control samples (Supplementary Table 1) 38. For differential AS analysis, we computed the change in percent spliced in (ΔPSI) for skipped exons (SE), mutually exclusive exons (MXE), alternative 5′ and 3′ splice sites (A5SS and A3SS) and retained introns (RI). Of all identified AS events (100%) and genes (100%) in the DM1 cortex splicing analysis, 1% of AS events met our mis-splicing criteria, corresponding to 7% of genes (Fig. 1a and Supplementary Fig. 1a). To investigate the DM1 splicing profile in genes related to ASD, we retrieved 38 ASD-relevant gene sets from previous studies and available databases (Supplementary Table 2). Our statistical analysis revealed a significant enrichment of mis-spliced events for 76% of the gene sets (Supplementary Fig. 1b). Importantly, there was a significant enrichment of genes from the Simons Foundation Autism Research Initiative (SFARI; OR = 2.2, FDR = 1.6 × 10⁻¹¹) database (Fig. 1b and Supplementary Fig. 1b), including SFARI's 'high confidence' (Score 1; OR = 2.7, FDR = 7.3 × 10⁻⁶), 'strong candidate' (Score 2; OR = 1.8, FDR = 2.7 × 10⁻²) and 'suggestive evidence' (Score 3; OR = 1.9, FDR = 1.1 × 10⁻⁴) gene categories. Our analysis also revealed a significant enrichment of high-confidence ASD-risk genes identified in two large Autism Speaks MSSNG-based whole-genome sequencing studies: MSSNG-2017 39  To test whether the level of ASD-risk gene mis-splicing was associated with the degree of CTG exp in DM1 prefrontal cortex, we correlated the CTG repeat length with the mean |ΔPSI| values for the mis-spliced ASD-risk genes in DM1.
We selected previously determined repeat sizes corresponding to the 90th percentile of the CTG length distribution (Supplementary Table 1), since a previous study demonstrated the strongest positive correlation between those CTG sizes and the general mis-splicing level in DM1 prefrontal cortex 38. This analysis revealed a significant positive correlation between CTG exp size and the number of mis-spliced events in ASD-risk genes from the SFARI (r = 0.83, P = 0.02) and MSSNG-2017 (r = 0.81, P = 0.03) gene sets (Fig. 1e and Supplementary Fig. 1d). Collectively, these results indicated that the DMPK 3'UTR CTG exp mutation in the prefrontal cortex perturbs the splicing of ASD-relevant genes.
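The quantities used in this section can be illustrated with a minimal sketch. It assumes a simplified read-count definition of percent spliced in (rMATS additionally normalizes counts by effective isoform length), and every function name and number below is hypothetical rather than taken from the study. It shows how a per-sample mean absolute change in percent spliced in (|ΔPSI|) and a Pearson correlation against repeat length could be computed.

```python
from math import sqrt


def psi(inclusion_reads: float, exclusion_reads: float) -> float:
    """Simplified percent spliced in: fraction of reads supporting inclusion."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads cover this splicing event")
    return inclusion_reads / total


def mean_abs_delta_psi(case_psis, control_psis):
    """Mean |PSI_case - PSI_control| over paired splicing events."""
    if len(case_psis) != len(control_psis) or not case_psis:
        raise ValueError("need two non-empty lists of equal length")
    deltas = [abs(c - k) for c, k in zip(case_psis, control_psis)]
    return sum(deltas) / len(deltas)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical repeat lengths and per-sample mean |dPSI| values.
    ctg_lengths = [150, 400, 650, 900, 1200]
    mis_splicing = [0.05, 0.09, 0.12, 0.20, 0.24]
    print(round(pearson_r(ctg_lengths, mis_splicing), 3))
```

With only a handful of samples, as in the cohort described above, a single outlier can shift r substantially, so the coefficient is best read together with its P value.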
Since previous studies have shown that miEs can locally modulate protein structure 30, we performed comparative in silico modeling of peptides with/without miE-encoded amino acid (aa) sequences to test their potential for protein modulation.
This analysis showed that some mis-spliced miEs might modulate internal (e.g., Ank2 and Nrxn1) or C-terminal (e.g., Dmd and Shank3) protein structures (Fig. 2h and Supplementary Fig. 2c-d). For example, the inclusion of the highly conserved Ank2 miE (12 nt), along with the use of a proximal alternative 3′ splice site (A3SS), results in a protein isoform with a TIP aa sequence, whereas miE exclusion promotes distal A3SS usage (15 nt) and results in a protein isoform with an LRSF aa sequence containing an S901 phosphorylation site 47 (Fig. 2h and Supplementary Fig. 2c). For Dmd, a 32 nt miE modulates the structure of the highly conserved dystrophin C-terminus that interacts with other proteins 48 (Supplementary Fig. 2d).

Regulation of the ASD-risk gene splicing program during cortex development
To assess the developmental splicing pattern of ASD-risk genes, we analyzed gene expression data for the brains of five mammals, including human, at different developmental stages 49. Our analysis showed an evolutionarily conserved increase of MBNL2 expression during neonate/P0 to middle childhood/P14 brain development (Fig. 3a and Supplementary Fig. 3a). Although MBNL1 expression increases simultaneously, its expression in the developed brain is approximately 3-fold lower than that of MBNL2. To assess the association between Mbnl1 and Mbnl2 gene expression and MBNL-sensitive splicing transitions in the developing mouse cortex, we evaluated RNA-seq data from WT mice 50. We computed mean |ΔPSI| values at nine developmental time points for AS events mis-spliced in ASD-risk genes in the Mbnl cDKO cortex and correlated them with Mbnl expression levels. As anticipated, the correlation between these variables was very strong for ASD-risk genes from the SFARI (r = 0.89, P = 1.  Supplementary Fig. 3b). Differential AS analysis demonstrated that 48-56% of mis-spliced AS events in ASD-risk genes were significantly changed between embryonic and adult cortex (Supplementary Fig. 3c). For example, Scn2a1 MXE, Ank2 miE, Tanc2 miE, and Dmd miE splicing transitions occurred at early developmental stages and reached a plateau postnatally between two and four weeks of age (Fig. 3c-d and Supplementary Fig. 3d), which is consistent with the developmental expression patterns of Mbnls (Fig. 3a).
To assess whether prenatal MBNL loss influences splicing of ASD-risk genes, we analyzed RNA-seq data of primary embryonic cortical neuron samples from Mbnl cDKO, constitutive Mbnl1-/- KO (hereafter Mbnl1 KO), constitutive Mbnl2-/- KO (hereafter Mbnl2 KO) and WT mice 51. We performed differential splicing analysis followed by ASD-risk gene enrichment analysis. In agreement with the relatively low embryonic Mbnl1 and Mbnl2 expression levels (Fig. 3 and Supplementary Fig. 3a), we did not observe significant enrichment of mis-splicing for ASD-risk genes from the SFARI (OR = 1.  Fig. 3e). To further investigate the impact of the DMPK CTG exp mutation on the ASD-risk gene splicing program in the developing human brain, we also analyzed DM1 and control brain organoid RNA-seq samples 52, followed by differential splicing and ASD-risk gene enrichment analyses. We found a significant enrichment of mis-spliced events in ASD-risk genes from the SFARI (OR = 1.6, FDR = 6.6 × 10⁻¹¹), MSSNG-2017 (OR = 3.3, FDR = 1.3 × 10⁻⁶) and MSSNG-2022 (OR = 2.6, FDR = 3.6 × 10⁻⁷) studies in the DM1 brain organoid, including the previously identified DMD miE (Fig. 3g-h). Overall, these results indicated that MBNL proteins govern the splicing patterns of multiple ASD-risk genes, including miEs, in the developing brain.

MBNL2 loss causes ASD-risk gene mis-splicing in multiple brain regions
Mbnl2 is the predominant gene paralog expressed in the adult human and mouse cerebral cortex, hippocampus, and cerebellum (Fig. 4a and Supplementary Fig. 4a-b), and these brain regions are known to be involved in ASD 53,54. To test the hypothesis that Mbnl2 loss perturbs splicing of ASD-risk genes in multiple brain regions, we performed RT-PCR splicing analysis of Scn2a MXE, Nrxn1 miE and Shank3 miE in the frontal cortex, hippocampus, and cerebellum of adult Mbnl2 KO and WT mice. Two of the three tested AS events demonstrated the most profound mis-splicing in the hippocampus (Fig. 4b and Supplementary Fig. 4c-d). Thus, to investigate Mbnl2-mediated AS regulation in ASD-risk genes in the hippocampus, we performed differential splicing analysis on RNA-seq data from Mbnl2 KO mice 55. In total, 4% of AS events were perturbed in 8% of detected genes, including Scn2a, Ank2, Nrxn1, and Shank3 (  Supplementary Fig. 4g). Approximately 9% of all mis-spliced ASD-risk genes in the Mbnl2 KO hippocampus overlapped with those found in the DM1 prefrontal cortex and Mbnl cDKO frontal cortex (Fig. 4g). The most consistently mis-spliced events were Ank2 miE and Scn2a MXE. Therefore, Mbnl2 loss alone impacts the alternative splicing of ASD-risk genes in multiple ASD-relevant brain regions, including the hippocampus.
SRRM4 protein promotes neuronal miE inclusion by binding to an intronic UGC motif approximately 15 nt upstream of the 3′SS of the targeted exon 61. In contrast, MBNL proteins bind to downstream intronic UGCY motifs to promote alternative exon inclusion 58,62. To support the idea that MBNL and SRRM4 regulate Ank2 miE inclusion by binding to distinct sequences, we retrieved available CLIP-seq data from an N2a cell line expressing flagged SRRM4 protein 61. As expected, we identified an SRRM4 CLIP-seq read cluster covering a conserved UGC motif 9 nt upstream of the Ank2 miE, and there were no reads supporting SRRM4 interaction with the MBNL binding site and vice versa (Fig. 5f, 6f and Supplementary Fig. 6e).
In contrast to MBNL2, SRRM4 has a relatively higher expression in the embryonic than in the postnatal brain in human and mouse (Supplementary Fig. 6e). As predicted, Srrm4 and Srrm3 gene expression levels were unchanged in Mbnl cDKO frontal cortex, Mbnl2 KO hippocampus, and Mbnl DKD CAD cells (Supplementary Fig. 6f). Interestingly, we noticed a significant 28% reduction of SRRM4 RNA in DM1 brain; however, this downregulation did not correlate with CTG exp (r = -0.41, P = 0.36) (Supplementary Fig. 6f-g). These results indicate that the MBNL and SRRM proteins regulate splicing of ASD-relevant miEs, such as ANK2 miE (Fig. 6g), in an independent manner.

Social interaction deficits in Mbnl2 knockout and Dmpk 3'UTR CTG exp knock-in mice
Ekström and colleagues have reported that DM1 children have a higher incidence of impaired social interaction and communication skills 63, and thus we tested sociability in our DM1 mouse models using the three-chamber test. The three-chamber test involves three phases: habituation, sociability, and social novelty 64 (Fig. 7a). We first selected heterozygous  (Fig. 7b). In contrast, homozygous Dmpk-(CTG) 480/480 KI mice showed no significant preference for the chamber with the novel animal over the novel object (Fig. 7b), signifying a lack of sociability.
To test the hypothesis that MBNL inhibition underlies the social deficit, we evaluated Mbnl2 KO and Mbnl1 KO mouse models. The Mbnl1 KO is characterized by muscle (e.g., myotonia), immune system and vision pathology 66,67, whereas the Mbnl2 KO exhibits central nervous system abnormalities, including neuronal morphology and synaptic changes 45,55,68. Like homozygous Dmpk-(CTG) 480/480 KI mice, and in contrast to WT, Mbnl2 KO mice did not spend significantly more time in the chamber with a novel animal (Fig. 7c). Additionally, Mbnl2 KO mice showed no significant preference for social novelty when presented with a familiar animal (Stranger 1) and a novel animal (Stranger 2) in the social novelty phase (Supplementary Fig. 7a). Since Mbnl1 is the dominant Mbnl paralog expressed in skeletal muscles, testing Mbnl1 KO mice in the social test failed to provide reliable results due to their profoundly limited mobility, evident during the habituation phase (Fig. 7d). In contrast, Mbnl2 KO mice did not exhibit significant exploratory locomotor deficits in the three-chamber test or the open-field test (Fig. 7d and Supplementary Fig. 7b).
These mouse behavioral results showed that either Dmpk-(CTG) 480/480 expression or MBNL2 protein loss led to social interaction deficits, a key diagnostic feature of DM1-associated ASD. The variability observed in the three-chamber test for both homozygous Dmpk-(CTG) 480/480 and Mbnl2 KO mice suggests incomplete penetrance of this phenotype.

Discussion
Here, we delineate the mechanisms underlying a specific ASD-linked tandem repeat expansion and its phenotypic consequences. We provide evidence that the DMPK 3'UTR CTG exp and its subsequent inhibition of MBNL's RNA splicing activity adversely impact the developmental ASD-risk gene splicing program, which leads to social interaction deficits, as we demonstrated in the mouse models. Thus, we propose that ASD can arise from a gene-specific tandem repeat expansion through an RNA-mediated gain-of-function mechanism whereby symptoms are a consequence of altered RNA splicing of multiple ASD-risk genes during brain development.
Aberrant RNA splicing is a characteristic feature of the ASD brain, including neuronal miE mis-splicing shown in approximately one-third of ASD cases 27,30. Although miEs are regulated by multiple RNA-binding proteins, their abnormal exclusion in ASD brains has been linked to downregulated SRRM4 expression. For example, the 12 nt ANK2 miE analyzed in this study is commonly mis-spliced in both DM1 and ASD brains and is co-regulated by MBNL and SRRM4 proteins. Like MBNL inhibition, SRRM4 haploinsufficiency causes not only miE mis-splicing but also a social deficit in mice 33. Additionally, AS events in the DM1 brain can mimic ASD-associated variants. For example, SCN2A MXE mis-splicing results in a protein isoform differing by a single negatively charged amino acid (adult-to-fetal: D209N) in the extracellular loop of the Nav1.2 channel voltage-sensing domain (Supplementary Fig. 6d). Previous research has demonstrated that, similar to the 'fetal' MXE inclusion, ASD-associated SCN2A variants reduce neuronal excitability 69-71. The role of mis-splicing in DM1-associated ASD is additionally supported by recent clinical trial results for tideglusib (AMO-02): ASD symptoms were improved in some of the treated children with DM1 72. In preclinical studies, tideglusib, a small-molecule inhibitor of glycogen synthase kinase 3 (GSK3), reduces CUG exp RNA levels and corrects aberrant splicing in DM1-derived cells and two DM1 repeat expansion mouse models 73.
Studies on tandem repeat expansions provide a unique opportunity to investigate the mechanistic threads in ASD, as was successfully demonstrated for the prototypical example of the CGG expansion in the FMR1 5' untranslated region (5'UTR).
The FMR1 5'UTR CGG exp underlies Fragile X-Associated Disorders, including Fragile X Syndrome (FXS), which is the most common monogenic disorder comorbid with ASD 74. Here, we provide a molecular mechanism for the DMPK 3'UTR CTG exp as a second example of a tandem repeat expansion leading to ASD traits.

Mouse models
All relevant ethical regulations for animal testing and research were observed, and this study received approval from the University of Florida Institutional Animal Care and Use Committee (IACUC). All animal procedures and endpoints were in accordance with IACUC guidelines, and animals were sacrificed in accordance with IACUC-approved protocols. B6.129S1- performed between 8 weeks and 6 months of age followed by brain harvesting. Mice were housed under specific pathogen-free conditions. Both the humidity (50%-70%) and temperature (70-75°F) were controlled, and the room was maintained on a 12:12 light:dark cycle (lights off at 8:00 pm). Mice were ear-notched and tail-snipped for identification and genotyping. Same-sex littermates were group-caged (2-4 mice/cage) at weaning in cages with water and standard rodent chow available ad libitum. The mice remained in the same cage group throughout the behavioral experiments.

Three Chamber Test
The three-chamber test was used to assess sociability in mouse models. The rectangular three-chambered apparatus consisted of three 20 cm x 40.5 cm x 22 cm chambers separated by clear Plexiglas walls. The walls had small doors that could be lifted or closed between phases to allow or prevent chamber access. Throughout the test, the center chamber remained empty, and objects or target (Stranger) mice were placed in the left or right chambers. The test mouse was the mouse whose behavior was analyzed. The target mice (Strangers 1 and 2) were matched in both age and sex to the test mouse and were placed into the test to provide a social stimulus. The test mice did not undergo any other experiments prior to being placed in the three-chambered social test. Similarly, the target mice served only as novel mice in the three-chambered test and were not involved in any other experiments. Wire cups were used to confine the target mice while allowing for social investigation by the test mouse. Before beginning, the two target mice were habituated for ten minutes in the inverted wire cups in which they were subsequently placed during the social test. During this habituation, we observed whether target mice exhibited aggression or abnormal behaviors, such as excessive grooming, bar-biting, and jumping, that could interfere with the test and would provide grounds for their exclusion. None of the target mice used in this study met these criteria for exclusion.
The habituation phase for the test mouse followed the habituation of the target mice. All chambers were completely empty during this phase. This phase allowed the test mouse to acclimate to the chambers and allowed us to assess if they showed a preference for one side before any novel objects or animals had been placed in the chamber.
For the sociability phase, an empty inverted wire cup was placed in one chamber while an inverted wire cup with one of the target animals (Stranger 1) was placed in the chamber on the opposite side. The chamber that contained the target animal alternated with each animal that was being tested. The test mouse was placed in the center chamber, and once the doors were lifted, left to explore all chambers for ten minutes. The test animal was allowed to interact with the cup with or without a social partner present for ten minutes.
For the social novelty phase, the same target animal (Stranger 1) that was used in the sociability phase remained in its place and the previously empty wire cup became occupied by a novel target mouse (Stranger 2). The test mouse was placed in the center chamber to begin and allowed to explore all chambers for ten minutes once the doors were lifted. This phase assessed whether the animal displayed more investigative behavior towards the novel target mouse (Stranger 2) or displayed a preference for the familiar mouse (Stranger 1).
After the three phases of the social test were completed and the animals were placed back in their home cages, the interior of the chambers and the wire cups were sanitized with ethanol before proceeding with another test mouse. Illumination was kept even on both sides of the apparatus. The test was conducted in a quiet room with minimal visual distractions and was recorded overhead using a video camera. Each video specified the date, test animal ID, and target mice used.
Mouse video tracking during the habituation phase was performed using ToxTrac (v 2.98) 76. The recorded videos were observationally coded by human raters using Behavioral Observation Research Interactive System (BORIS v 8.1.2) software 77. Time in each chamber and the number of social/object interactions were coded during the sociability and social novelty phases, respectively. Social interactions were operationally defined as the test mouse sniffing the target mouse, which could include nose-to-nose interaction, the test mouse sniffing any other part of the body of the target mouse, or nose-to-cup interaction, and rearing on the wire cup with the target mouse 64. Object interactions only applied to the sociability phase and were defined as sniffing or rearing on the wire cup that did not have a target mouse in it. During the social novelty phase, social interactions were coded for both chambers, differentiating which animal was the novel one and which was the familiar one. Twenty percent of the coded observations were randomly selected and independently scored by another researcher to determine the agreement between raters. A criterion of 85% or greater inter-observer agreement was established. Scores were counted as an agreement if they were recorded within 1 second of each other for point events or, for durations of behavior, within 2 seconds of each other for the start and stop times. These parameters were set to account for the reaction time of the scorers. All the data included in this study met the criterion of 85% inter-rater agreement.
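The agreement rule for point events can be sketched as follows. This is an illustration only: the text does not state how percent agreement was computed, so the greedy in-order matching and the choice of denominator (the larger of the two event counts) are assumptions, as are all names and timestamps.

```python
def count_point_agreements(rater_a, rater_b, tolerance_s=1.0):
    """Count events from two raters whose timestamps lie within tolerance_s.

    Both lists are sorted and walked in parallel, so each event is matched
    at most once (an assumed matching scheme, not one stated in the text).
    """
    a = sorted(rater_a)
    b = sorted(rater_b)
    i = j = matched = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance_s:
            matched += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # event seen only by rater A
        else:
            j += 1  # event seen only by rater B
    return matched


def percent_agreement(rater_a, rater_b, tolerance_s=1.0):
    """Matched events as a percentage of the larger event count (assumed)."""
    total = max(len(rater_a), len(rater_b))
    if total == 0:
        return 100.0
    return 100.0 * count_point_agreements(rater_a, rater_b, tolerance_s) / total


if __name__ == "__main__":
    # Hypothetical sniffing-event timestamps in seconds from two raters.
    first = [1.0, 5.2, 9.0, 14.0]
    second = [1.4, 5.0, 11.5, 14.9]
    score = percent_agreement(first, second)
    print(score, "meets criterion" if score >= 85 else "below criterion")
```

For duration events, the same idea would apply to start and stop times separately, with a 2-second tolerance.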

Automated Open Field Test
Mice were acclimated to the procedure room for approximately two hours before the test. For the open field, test mice were then placed in the center of the darkened activity-monitoring 17"

RT-PCR Splicing Analysis
Total RNA (1-2 µg) was reverse transcribed using the GoScript Reverse Transcription System (Promega)/High-Capacity cDNA Reverse Transcription Kit (Thermo Fisher Scientific) with Random Primers (Promega, Thermo Fisher Scientific) according to the manufacturer's protocol. PCR was conducted using GoTaq G2 Flexi DNA Polymerase (Promega). PCR products were resolved on 2% agarose gels stained with ethidium bromide, and gels were visualized on a Molecular Imager ChemiDoc XRS+ (BioRad)/G:Box (Syngene) and analyzed using Image Lab (BioRad)/GeneTools software (Syngene). All primers and PCR product sizes are listed in the key resources table.

RNA-seq and CLIP-seq Analysis
All RNA-seq and CLIP-seq data accession numbers are listed in the key resources table. Reads were aligned to the human hg38 or mouse mm10 genomes using STAR (v 2.7.5c) 78. Splicing analysis was performed using rMATS (v 4.1.0) 79. Sashimi plots were generated using the ggsashimi.py script 80. Median coverage was used to generate the plot (-A median). The total numbers of junction reads are shown. The introns were compressed for better representation (--shrink). Transcript expression quantification was performed using Salmon (v 1.1) 81, and differential gene expression analysis was performed using DESeq2 (v 1.32.0) 82.
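As an illustration of how event-level and gene-level mis-splicing fractions can be derived from an rMATS-style table, the sketch below reads a tab-separated table with the rMATS output columns `geneSymbol`, `IncLevelDifference` and `FDR`. The cutoffs (|ΔPSI| ≥ 0.1 and FDR < 0.05) are hypothetical placeholders, not the mis-splicing criteria used in this study, and the example rows are invented.

```python
import csv
import io


def summarize_mis_splicing(table_text, min_abs_dpsi=0.1, max_fdr=0.05):
    """Return event and gene counts passing hypothetical mis-splicing cutoffs."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    events = 0
    hit_events = 0
    all_genes = set()
    hit_genes = set()
    for row in reader:
        events += 1
        gene = row["geneSymbol"]
        all_genes.add(gene)
        dpsi = float(row["IncLevelDifference"])
        fdr = float(row["FDR"])
        if abs(dpsi) >= min_abs_dpsi and fdr < max_fdr:
            hit_events += 1
            hit_genes.add(gene)
    return {
        "events": events,
        "mis_spliced_events": hit_events,
        "genes": len(all_genes),
        "mis_spliced_genes": len(hit_genes),
    }


# Invented rows for illustration only.
EXAMPLE_TABLE = "\n".join([
    "geneSymbol\tIncLevelDifference\tFDR",
    "Ank2\t-0.35\t0.0001",
    "Ank2\t0.02\t0.8",
    "Scn2a\t0.22\t0.01",
    "Dmd\t-0.05\t0.001",
])

if __name__ == "__main__":
    print(summarize_mis_splicing(EXAMPLE_TABLE))
```

Because a gene counts as mis-spliced when any one of its events passes the cutoffs, the gene-level fraction is typically larger than the event-level fraction.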

ASD-risk Gene Datasets
See Supplementary Table 2.

Protein Structure Prediction
The modeled structures of mouse proteins up to 50 aa, or 214 aa for SHANK3, were predicted using the UCSF ChimeraX AlphaFold tool with the use of ColabFold, an optimized version of AlphaFold2, with default parameters 83,84. Protein fragments used for structure modeling with miE-encoded residues are underlined.

Allen Mouse Brain Atlas
Mouse Brain Atlas (mouse.brain-map.org). Experiments were performed on P56 male C57BL/6J mice. For a detailed description of the in situ hybridization (ISH) procedure and informatics data processing see: help.brain-map.org/display/mousebrain/Documentation.

Group Size
Group size determinations were based on assuming power = 0.8 and α = 0.05, with effect sizes estimated from our previous studies, using G*Power (v 3.1) software. RNA-seq maximum group sizes and sample characteristics were predetermined. We analyzed sex- and age-matched groups.
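For orientation, the relationship between effect size and group size at power = 0.8 and α = 0.05 can be approximated as below. This is a normal-approximation sketch for a two-sided, two-group comparison of means, not a reproduction of the G*Power calculation (which is based on the noncentral t distribution and tends to return slightly larger groups). The effect sizes shown are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.8):
    """Approximate per-group n for a two-sided, two-sample comparison.

    Uses n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2, rounded up.
    """
    if effect_size_d <= 0:
        raise ValueError("effect size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


if __name__ == "__main__":
    # Hypothetical standardized effect sizes (Cohen's d).
    for d in (0.5, 1.0, 1.5):
        print(d, n_per_group(d))
```

Halving the effect size roughly quadruples the required group size, which is why effect-size estimates from earlier studies drive the design.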

Statistical Analysis
Whole transcriptome statistical analysis for splicing and gene expression was performed using rMATS (v 4.1.0) 79 and DESeq2 (v 1.32.0) 82, respectively. The odds ratio (OR) was calculated using the 'epitools' package in R, and statistical significance was determined by Fisher's exact test followed by multiple comparison correction using the FDR method. Other statistical analyses were performed using GraphPad Prism (v 9.5.1). Normality was assessed by the Shapiro-Wilk test, followed by parametric or nonparametric tests and a post hoc test for multiple comparisons. Graphs were generated in R using the 'ggplot2' package and GraphPad Prism (v 9.5.1) software. Details are specified in the figure legends.
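The enrichment statistics can be sketched in plain Python under stated assumptions. The analysis itself was run in R with 'epitools'; the version below uses the simple cross-product odds ratio (the text does not say which epitools estimator was used) and assumes that "the FDR method" means the Benjamini-Hochberg adjustment, which is what R's `p.adjust` applies for `method = "fdr"`. The 2x2 table layout and all counts are hypothetical.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for the 2x2 table [[a, b], [c, d]].

    a: mis-spliced and in gene set, b: mis-spliced and not in gene set,
    c: not mis-spliced and in gene set, d: neither (assumed layout).
    Returns infinity when b * c is zero.
    """
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value from the hypergeometric distribution."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more probable than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts for one gene set.
    print(odds_ratio(40, 460, 200, 5300), fisher_exact_two_sided(40, 460, 200, 5300))
```

In practice one table is built per gene set, the P values from all gene sets are collected, and the adjustment is applied once across that list.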

Declarations
The data used for the analyses described in this manuscript were obtained from dbGaP accession number phs000424.v8.p2 on 01/12/2023. The ASD RNA-seq data for this publication were obtained from the NIMH Repository & Genomics Resource, a centralized national biorepository for genetic studies of psychiatric disorders. Data used in this study were generated as part of the PsychENCODE Consortium, supported by R01MH094714 awarded to Daniel Geschwind.

and upper (75th percentile) quartiles. Whiskers show minimum and maximum. Numbers of mis-spliced AS events are provided as n values. Statistical differences were determined by the Kruskal-Wallis test followed by Dunn's multiple comparison test: # P = 0.067, * P = 0.038, *** P = 0.0006, and **** P < 0.0001. f, Sashimi plot of embryonic Mbnl cDKO (N = 2) and WT (N = 2) RNA-seq samples for Dmd miE. The bar graph shows the mean PSI ± SD; **** FDR < 0.0001. g, MSSNG-2017, MSSNG-2022 and SFARI gene-set enrichment analysis for mis-spliced genes in 8-month-old DM1 brain organoid (N = 2 in 2 replicates). Points represent the OR and error bars represent the 95% CI. The vertical dashed line represents OR = 1; * FDR = 0.013 and **** FDR < 0.0001. h, Sashimi plot of DM1 (N = 2 in 2 replicates) and WT (N = 2 in 2 replicates) 8-month-old brain organoid RNA-seq samples for DMD miE. The bar graph shows the mean PSI ± SD; **** FDR < 0.0001.

Supplementary Files
Supplementary file associated with this preprint: SznajderSINatNeuroscience.pdf