Mechanisms of mRNA processing defects in inherited THOC6 intellectual disability syndrome

THOC6 is the genetic basis of autosomal recessive THOC6 Intellectual Disability Syndrome (TIDS). THOC6 facilitates the formation of the Transcription Export complex (TREX) tetramer, composed of four THO monomers. The TREX tetramer supports mammalian mRNA processing that is distinct from yeast TREX dimer functions. Human and mouse TIDS model systems allow novel THOC6-dependent TREX tetramer functions to be investigated. Biallelic loss-of-function (LOF) THOC6 variants do not influence the expression and localization of TREX members in human cells, but our data suggest reduced binding affinity of ALYREF. Impairment of TREX nuclear export functions was not detected in cells with biallelic THOC6 LOF. Instead, mRNA mis-splicing was observed in human and mouse neural tissue, revealing novel insights into THOC6-mediated TREX coordination of mRNA processing. We demonstrate that THOC6 is required for regulation of key signaling pathways in human corticogenesis that dictate the transition from proliferative to neurogenic divisions, which may inform TIDS neuropathology.

THOC6 is a subunit of the six-member THO (suppressors of the transcriptional defects of hpr1 delta by overexpression) complex, which serves as a core component of the transcription/export (TREX) complex [20]. TREX is critical for licensing export factors required for nuclear pore docking and export of mRNA from the nucleus to the cytoplasm [21][22][23][24]. While this is a conserved function of TREX, there are notable species differences in TREX composition that mirror the evolutionary complexity of mRNA processing requirements. In yeast, TREX is a dimer composed of two five-subunit THO monomers, with THOC6 being the notable exception. Yeast TREX monomers dimerize via the coiled-coil domains of Thp2 and Mft1, the yeast orthologs of THOC5 and THOC7 [25]. In humans, TREX is a tetramer, where dimers of THO monomers are tethered by THOC6 [25]. The increased size and molecular complexity of the mammalian TREX tetramer correlates with increased mRNA processing demands that have evolved in organisms with higher transcriptome complexity and mRNP composition, namely expression of long genes with high levels of complex splicing patterns [26]. For example, introns comprise ~24% of mammalian genomes [27]. On the level of gene organization, introns are longer and are present in >95% of all human genes [28][29][30]. By contrast, introns are present in only 5% of yeast genes, are short relative to the gene length, and are mostly limited to one per gene [29][30][31][32]. Thus, alternative splicing is rare in yeast, but plays a major role in gene expression in mammals, especially humans.
The conservation of THO dimer functions in mammalian cells is an open question. THOC1, THOC3, THOC5, and THOC7 exhibit high probability of loss-of-function intolerance (pLI) in gnomAD and have not been identified as the genetic basis of developmental disorders, suggesting that loss of conserved THO components is likely embryonic lethal. THOC2 is the genetic basis of an X-linked neurodevelopmental disorder [33].
Depletion of THO components THOC1-THOC5 and THOC7 leads to strong nuclear export defects [34]. This leads to the speculation that dimers retain mRNP functions in mammals, and that THOC6-dependent tetramer functions enhance the efficiency and coordination of these activities. This would also explain the tissue sensitivity in TIDS, where development is disrupted in tissues that disproportionately express long genes and exhibit elevated isoform diversity.
Unlike the dimer, the TREX tetramer recruits functionally diverse auxiliary factors that have no yeast orthologs. DDX39A, CHTOP, UIF, LUZP4, POLDIP3, ZC3H11A, ERH, ZC3H18, SRRT, and NCBP3 complex with tetrameric TREX and participate in the steps of mRNP processing and export, including mRNA 5' capping, splicing, and 3' end processing, to create a messenger ribonucleoprotein complex (mRNP) capable of translocating through the nuclear pore complex into the cytoplasm [22,35,36]. Increased TREX complexity may represent greater mRNA processing functionality as the complexity of mRNP processing evolved [36,37]. Likewise, the larger complex would allow TREX to serve as an mRNA chaperone to prevent formation of DNA-RNA hybrid or R-loop structures, which can promote genome instability, with the emergence of longer transcripts with elevated splicing [38,39].
THOC6 evolved as a scaffolding protein to create a larger TREX complex, which has implications for mRNA processing coordination in metazoans relative to yeast. Despite the neurological features of TIDS, there is a lack of research in neural cells, which undergo extensive changes in mRNA processing during differentiation [40]. Using a series of THOC6 models with clinically relevant pathogenic alleles, we assess essential functions of TREX tetramers in mRNA processing in neural development. This experimental approach also allows THOC6-dependent tetramer function to be evaluated relative to dimer functions. First, we contribute to the THOC6 pathogenic allele series and TIDS clinical phenotypic spectrum, extending the total number of reported THOC6 variants in TIDS to 20 and the total number of reported affected individuals to 34. Second, we generated a Thoc6 mouse model and human induced pluripotent stem cell (iPSC)-derived cell culture models to investigate shared pathogenic mechanisms of mammalian THOC6. Given the high penetrance of microcephaly in TIDS, we focus on mRNA processing/export changes and accompanying phenotypes that occur during cortical development by analyzing primary mouse tissue and dorsal forebrain-fated human organoids. We propose a model of unproductive selective mRNA processing/export from partial TREX disruption due to loss-of-function THOC6 alleles, leading to dysregulation of proliferation and differentiation. Our findings reveal a broader supportive role across mRNA processing within the context of THOC6 variants than has previously been attributed to THO.

RESULTS
Biallelic missense and nonsense THOC6 variants are the genetic basis of TIDS
While initially detected in Hutterite populations [11], a growing number of pathogenic biallelic THOC6 variants are being discovered across the globe in individuals of diverse ancestry [8][9][10][12][13][14][15][16][17][18][41]. As part of an ongoing effort to determine the genetic etiologies of syndromic ID, we discovered nine THOC6 variants by exome-based genetic testing (Figure 1A). We confirmed six recurrent variants at W100, G190, V234, and G275 amino acids and identified three novel alleles (p.Q47*, p.E188K, and p.R247Q). The clinical phenotypes associated with THOC6 variants affirm penetrance of the core clinical features of TIDS, namely global developmental delay, moderate to severe ID, and facial dysmorphisms (Figure 1B). Variable expressivity of cardiac and renal malformations, structural brain abnormalities with and without seizures, urogenital defects, recurrent infections, and feeding complications was also noted, clinical features that highlight the multiorgan involvement of this developmental syndrome (Figure 1C). Detailed clinical summaries for all individuals are provided in Tables S1 and S2.
The novel THOC6 variants are representative of previously described nonsense and missense variants that contribute equally to the severity of TIDS phenotypes. We describe a nonsense THOC6 c.139C>T (p.Q47*) variant in exon 2 of proband 1, a missense c.740G>A (p.R247Q) variant in exon 11 of proband 5, and a missense c.562G>A (p.E188K) variant in proband 6 (Figures 1A and 1C). THOC6 comprises seven WD40 repeat domains (Figure 1D) that form a β-propeller structure when folded. These novel variants, like other clinically relevant THOC6 variants, map to the WD40 repeats that comprise the beta-strand structural regions of THOC6 (Figure 1D). The p.Q47* nonsense variant represents a cluster of three THOC6 variants in the first WD40 repeat that exhibit a consistent genotype-phenotype correlation for both recessive nonsense and missense variants. The same trend is observed for the novel missense variants that are localized to subsequent THOC6 WD40 domains, which supports a loss-of-function (LOF) pathogenic mechanism for both THOC6 variant types.
A LOF mechanism is also implicated by the clinical consistency observed between biallelic inheritance of pathogenic THOC6 haplotypes and biallelic inheritance of a single haplotype variant alone. Biallelic inheritance of a triple-variant haplotype (TVH), THOC6 c.[298T>A;700G>C;824G>A] (p.[W100R;V234L;G275D]), has been reported in seven individuals with clinical features of TIDS [9,13,17,42]. The TVH segregates as a founder haplotype in individuals of European ancestry [42]. In comparison, a homozygous THOC6 c.824G>A; p.G275D variant was identified in siblings with classic TIDS in family 4 who are of South Asian ancestry. This finding provides additional evidence for the pathogenicity of the TVH THOC6 c.824G>A; p.G275D variant but does not negate the predicted pathogenicity of the corresponding W100R or V234L TVH variants. Pathogenicity of p.W100R or p.V234L is supported by variants detected in WD40 repeats 2 and 4 (Figure 1D). Comparing biallelic inheritance of the TVH and of THOC6 c.824G>A; p.G275D suggests that a single THOC6 variant in both alleles is sufficient to comprehensively disrupt THOC6, a baseline deficiency not exacerbated by accumulation of additional LOF variants.

THOC6 variants show mRNA stability with differential effect on protein abundance
To investigate the genetic mechanism of THOC6, the impact of pathogenic variants on mRNA nonsense-mediated decay (NMD) and protein expression was tested in embryonic stem cells (ESCs) and iPSCs, collectively referred to as human pluripotent stem cells (hPSCs). hPSCs were reprogrammed from two individuals with TIDS (6:IV:1, THOC6 E188K/E188K and 7:V:2, THOC6 W100*/W100*) and their respective unaffected heterozygous parent (6:IV:2, THOC6 E188K/+ and 7:V:1, THOC6 W100*/+) (Figure 1E), preserving the shared genetic background between affected and unaffected conditions. Consistent with a LOF mechanism, reduction in protein expression due to mRNA nonsense-mediated decay was predicted. However, THOC6 mRNA transcripts remain relatively stable between genotypes, as assessed by Actinomycin D treatment, in which transcription is inhibited, compared to the unstable mRNA FOS, which is quickly degraded (Figure 1F) [43]. This finding is consistent regardless of THOC6 variant, with the c.299G>A (p.W100*) and c.562G>A (p.E188K) variants exhibiting decay rates similar to those of wildtype transcripts (Figures S1C and S1D). This finding could reflect defective NMD from failure of THOC6-affected mRNA to be exported to the cytoplasm. Nevertheless, the impact on protein expression is divergent between nonsense and missense variants. Significant reductions in THOC6 abundance were detected in THOC6 W100*/W100* and THOC6 E188K/E188K iPSCs relative to the THOC6 +/+ control, with the most remarkable reduction for THOC6 W100*/W100*. Significant abundance differences were also noted for heterozygous unaffected iPSCs relative to wildtype controls (Figure 1G). Full-length THOC6 in THOC6 W100*/W100* samples represents a minority readthrough product. Increased frequency of this rare event was promoted by treatment with 30 mM Ataluren, which extends translation by skipping premature termination codons, leading to an increase in THOC6 detectable by Western blot in THOC6 W100*/W100* iPSCs (Figure 1G). No truncated product was observed by Western blotting in THOC6 W100*/W100* and THOC6 W100*/+ iPSCs, suggesting THOC6 reduction is due to rapid degradation of an unstable, truncated protein. Stable expression from missense THOC6 alleles suggests variant THOC6 is likely functionally inactive.
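As an illustration of how transcript stability could be compared from an Actinomycin D time course, the sketch below fits a first-order decay to log-transformed relative abundance and reports a half-life. It is a minimal example under stated assumptions: the function name `half_life`, the time points, and the abundance values are hypothetical and are not data from this study.

```python
import math

def half_life(times_h, rel_abundance):
    """Estimate mRNA half-life (hours), assuming first-order decay.

    Fits ln(abundance) = a - k * t by ordinary least squares and
    returns ln(2) / k.
    """
    logs = [math.log(a) for a in rel_abundance]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times_h, logs))
    k = -sxy / sxx  # decay constant per hour
    if k <= 0:
        raise ValueError("abundance does not decay over the time course")
    return math.log(2) / k

# Hypothetical values: a stable transcript and a rapidly degraded one.
times = [0, 1, 2, 4]
stable = [1.0, 0.92, 0.85, 0.72]
unstable = [1.0, 0.5, 0.25, 0.0625]

print(f"stable:   {half_life(times, stable):.1f} h")
print(f"unstable: {half_life(times, unstable):.1f} h")
```

Similar half-lives across genotypes, alongside a much shorter half-life for a control transcript such as FOS, would correspond to the pattern described above.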

THOC6 variants interfere with TREX functions
Based on the solved crystal structure [25], LOF THOC6 variants are predicted to impair TREX tetramer formation. The WD40 repeat domains of THOC6 form beta strands predicted to provide the structural interface for TREX core tetramer formation. Pathogenic THOC6 variants disrupt residues conserved in mammals, though several are not conserved across other metazoan species, mirroring TREX composition variability across species and the evolving function for THOC6 in TREX (Figure 2A). Evaluation of the TREX crystal structure indicates that pathogenic THOC6 variants are positioned at the TREX core tetrameric interface, where THOC5-THOC7 interaction is responsible for dimerization and THOC6-THOC5 interaction tetramerizes the complex (Figures 2B, 2C, and 2D). The allelic series of THOC6 LOF variants implicates a pathogenic mechanism where LOF missense variants are predicted to perturb THOC6 β-propeller folding and/or interactions with THOC5 and THOC7 that are required for TREX tetramer assembly, and nonsense variants would produce a similar outcome due to low protein abundance [25]. Likewise, THOC6 variants do not alter the protein abundance of other THO/TREX members (Figure 2E). Consistent with normal abundance of functional THO protein, subcellular localization at nuclear speckle domains was observed by immunohistochemistry (Figures S1E and S1F). Conversely, ALYREF association with the THO subcomplex is diminished in THOC6 E188K/E188K and THOC6 W100*/W100*, as assessed by co-immunoprecipitation with THOC5 and THOC6 in patient-derived iPSC lines (Figure 2F). These findings suggest a THOC6-dependent association of ALYREF with THO, with implications for the affinity of other adaptors due to the potential disruption of TREX tetramer formation.

Thoc6 is required for mouse embryogenesis
To investigate Thoc6 pathogenic mechanisms in mammalian neural development in vivo, we used CRISPR/Cas9 genome editing to introduce an insertion variant in mouse Thoc6 exon 1 that resulted in a premature termination codon predicted to ablate Thoc6 expression (p.P6Lfs*8, herein referred to as Thoc6 fs) (Figure 3A). Thoc6 +/fs male and female mice do not display phenotypic abnormalities, but their intercrosses yielded no homozygous offspring. Analysis at selected embryonic days (E) of gestation confirmed that Thoc6 fs/fs littermates die in utero. Differences in embryonic morphology between wildtype (WT) and Thoc6 fs/fs embryos were noted starting at E7.5, the earliest day of analysis. By E9.5, Thoc6 fs/fs embryos were smaller with delayed development; however, the difference in the developing neocortex was particularly pronounced (Figure 3B). No body turning differences were observed. Consistent with embryonic lethality, THOC6 was undetectable in E8.5 Thoc6 fs/fs mouse embryos relative to control littermates (Figure 3C). In E9.5 Thoc6 fs/fs embryos, a developmental timepoint with high Thoc6 expression, Thoc6 is detectable by Western blot at greatly diminished levels (Figure 3C). Embryonic lethality was confirmed by E11.5, indicating one functional allele of Thoc6 is essential for mouse embryonic development (Figures 3C and 3D).
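The absence of homozygous offspring from heterozygous intercrosses can be put in quantitative terms. The sketch below computes the probability of recovering no homozygotes among n genotyped offspring if homozygotes were viable at the expected Mendelian frequency of 25%. The function name and the offspring counts are hypothetical; the number of pups genotyped is not stated in the text.

```python
def prob_no_homozygotes(n_offspring, p_homozygous=0.25):
    """Probability of zero homozygous offspring among n, assuming each
    pup is independently homozygous with probability p_homozygous."""
    if n_offspring < 0:
        raise ValueError("n_offspring must be non-negative")
    return (1.0 - p_homozygous) ** n_offspring

# Hypothetical cohort sizes, for illustration only.
for n in (10, 20, 40):
    print(n, f"{prob_no_homozygotes(n):.2e}")
```

Under these assumptions the probability falls below 1% once about 20 offspring have been genotyped, which is why a complete absence of homozygotes is taken as evidence of lethality.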

Global mRNA export is not altered in THOC6 models of human in vitro neural development
Given the prominent link between the THO subcomplex and RNA export in current literature, we first sought to investigate the impact of THOC6 variants on RNA nuclear export functions in human neural progenitor cells (hNPCs) differentiated from hPSCs (Figure 4A). NPCs are the embryonic cell population frequently implicated in the developmental mechanism of primary microcephaly, a TIDS clinical feature.
Defects in mRNA export are typically observed as differential accumulation of polyadenylated (polyA+) mRNA in the nucleus, enriched at nuclear speckle domains [44]. Standard oligo-dT fluorescent in situ hybridization (FISH) was performed on hNPCs to visualize polyA+ mRNA signal in nuclear and cytoplasmic cellular fractions (Figures S3A, S3B, and S3C). Comparison of the nuclear-to-cytoplasmic (N/C) polyA+ signal intensity ratios across genotypes indicates a slight reduction in THOC6-affected samples, suggesting a trend towards nuclear reduction in affected hNPCs relative to THOC6 +/+ and heterozygous unaffected control hNPCs (Figure S3C). The significance of this modest export finding is evident when the N/C polyA+ signal intensity ratios are compared to THOC6 +/+ hNPCs treated with wheat germ agglutinin (WGA), a potent inhibitor of all nuclear pore transport and the positive control for export changes [45] (Figure S3A). Relative to all THOC6 genotypes, WGA-treated hNPCs have a significantly higher N/C ratio (Figure S3C), attributed to strong polyA+ mRNA accumulation in the nucleus (Figure S3B). Although bulk mRNA export is largely unaltered in THOC6-affected hNPCs, a THOC6-dependent role for tetrameric TREX cannot be ruled out in the export of specific polyA+ mRNAs or in mRNP processing functions upstream of export, TREX functions that have not been explored in hNPCs.
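The N/C comparison described above can be sketched as a per-cell ratio of nuclear to cytoplasmic polyA+ signal, averaged per condition. This is a minimal illustration only: the function names and the intensity values are hypothetical, and image segmentation and background correction are assumed to have been done upstream.

```python
from statistics import mean

def nc_ratio(nuclear, cytoplasmic):
    """Nuclear-to-cytoplasmic ratio of mean polyA+ signal for one cell."""
    if cytoplasmic <= 0:
        raise ValueError("cytoplasmic intensity must be positive")
    return nuclear / cytoplasmic

def mean_nc_ratio(cells):
    """Average N/C ratio over (nuclear, cytoplasmic) intensity pairs."""
    return mean([nc_ratio(n, c) for n, c in cells])

# Hypothetical per-cell mean intensities (arbitrary units).
control = [(120.0, 100.0), (110.0, 100.0)]
wga_treated = [(300.0, 100.0), (280.0, 100.0)]  # export block: nuclear accumulation

print(mean_nc_ratio(control), mean_nc_ratio(wga_treated))
```

An export defect would appear as a shift towards higher N/C ratios, as in the WGA-treated positive control.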

THOC6 depletion reveals TREX function in pre-mRNA splicing in hNPCs
Proper mRNP processing, including mRNA splicing, is required for TREX-dependent RNA nuclear export, linking these steps in mRNA biogenesis. Co-transcriptional recruitment of TREX to the 5' end of maturing mRNPs, coupled with described TREX associations with splicing factors [46], led us to investigate features of mRNA processing for vulnerability to loss of THOC6. To capture RNA processing differences caused by loss of THOC6 function, we performed RNA-sequencing (RNAseq) on ribosomal (r)RNA-depleted RNA extracted from wildtype, heterozygous unaffected, and homozygous affected hNPCs (Table S4).
Principal components analysis (PCA) of RNAseq data demonstrates reproducibility across replicates and distinct transcriptomic differences between the affected hNPC replicates and the unaffected control hNPC replicates (Figure S4A). Genotype-driven differential expression and splicing changes were assessed. To investigate a THOC6-dependent role for TREX in splicing, comparative splicing analysis was carried out using the rMATS pipeline on biallelic THOC6 E188K/E188K and THOC6 W100*/W100* samples versus heterozygous controls (Table S5) [47]. A combined total of 3,796 significant alternative splicing (AS) events were detected in affected hNPCs, representing the major AS types: skipped/cassette exon (SE), alternative 5' splice site (A5SS), alternative 3' splice site (A3SS), retained intron (RI), and mutually exclusive exon (MXE). The most overrepresented AS events observed in affected cells were SE (56%, 2136 of 3796) and RI (21%, 784 of 3796). The high frequency of RIs is notable and unique relative to splicing defects identified in LOF models of other splicing factors, as well as to normal splicing patterns in neural development (Figure 4B) [48][49][50][51][52][53]. AS events in affected hNPCs show comparable inclusion and exclusion of AS junctions, with a slight trend towards inclusion due in part to the high frequency of RIs (Figure 4B). SE and RI splicing events occur by distinct molecular mechanisms, mediated by EJC pathways. Detection of defects in both splicing categories suggests the THO tetramer serves as a molecular platform for coordinating complex splicing events, as opposed to regulation of a specific subset of splicing events controlled by association with and function of individual RNA splicing factors. In agreement with this finding, we did not find consistent motif enrichment for specific RNA-binding proteins at AS junctions. These findings implicate a novel role for THOC6-dependent TREX splicing in mRNP processing in hNPCs.
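For readers less familiar with how AS events are quantified, the following sketch computes a length-normalized inclusion level (percent spliced-in, PSI) from inclusion and skipping read counts, and the difference in PSI between two conditions. It is a simplified illustration of the kind of quantity reported by splicing tools, not the rMATS implementation; the function names, read counts, and effective lengths are hypothetical.

```python
def psi(inclusion_reads, skipping_reads, inclusion_len=1.0, skipping_len=1.0):
    """Length-normalized inclusion level for a single AS event.

    Returns None when no reads support either isoform.
    """
    inc = inclusion_reads / inclusion_len
    skp = skipping_reads / skipping_len
    total = inc + skp
    if total == 0:
        return None
    return inc / total

def delta_psi(affected, control):
    """PSI(affected) - PSI(control) for (inclusion, skipping) read pairs."""
    psi_affected = psi(*affected)
    psi_control = psi(*control)
    if psi_affected is None or psi_control is None:
        return None
    return psi_affected - psi_control

# Hypothetical counts for one retained-intron event.
print(delta_psi(affected=(30, 10), control=(10, 30)))
```

A positive difference indicates greater inclusion (for example, more intron retention) in the affected samples; a negative difference indicates greater exclusion.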
Since AS motif enrichment analysis of THOC6-affected and control hNPC transcriptomic data did not reveal trans-regulatory elements responsible for the differential splicing patterns, it was posited that cis-elements may underlie these differences. A maximum entropy model that assesses short sequence motif distributions was used to test the strength of the donor (5') and acceptor (3') splice sites of AS events [54]. A general trend towards weaker splice sites was detected at differential SE, RI, and A3SS events in affected cells (unpaired two-tailed t test, Figure 4C). The SE, RI, and A3SS events were enriched in genes with a disproportionately high number of isoforms that show dependence on weak, alternative/cryptic splice sites to facilitate isoform diversity (Figure 4D) [55]. RI events in affected hNPCs also had weaker splice sites compared to controls, suggesting that THOC6 deficiency induces mis-splicing at weak splice sites. In addition, the SE, RI, and A3SS AS events in affected THOC6 hNPCs impacted exons/introns that are significantly longer than those of nonsignificant events (Figure 4E). Likewise, the introns retained in RI events were significantly longer, with a 1.4-fold increase in length quantified for significant RI events (p < 0.0001, unpaired two-tailed t test, Figure 4E). Lastly, no positional bias was observed for AS events (Figures S4D and S4F). To validate our bioinformatic analysis, AS inclusion trends in select, top, shared events were confirmed by qRT-PCR, demonstrating a high correlation (unpaired two-tailed t test, Figure 4F). Together, the detected RNA processing signature across diverse SE and RI events at weak splice sites suggests impaired splicing fidelity from loss of THOC6.
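The comparisons of splice-site strength and intron length above rely on unpaired two-tailed t tests. As a minimal sketch of the test statistic, the function below computes Student's t with pooled variance for two independent groups; it returns the statistic and degrees of freedom only, and a p-value would be obtained from a t distribution in a statistics package. The function name and the scores are hypothetical and are not values from this study.

```python
import math
from statistics import mean, variance

def unpaired_t(group_a, group_b):
    """Student's t statistic (pooled variance) and degrees of freedom
    for two independent samples."""
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / std_err, df

# Hypothetical splice-site scores: significant events vs. non-significant events.
significant = [1.0, 2.0, 3.0]
nonsignificant = [4.0, 5.0, 6.0]
t_stat, dof = unpaired_t(significant, nonsignificant)
print(round(t_stat, 3), dof)
```

A negative statistic here reflects lower (weaker) scores in the first group, the direction described for significant AS events.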
To investigate the role of RNA misprocessing in ID pathology, we intersected our AS events with the genes that are known to cause syndromic ID, deposited in the SysID database (SysIDdb). A total of 152 genes with significant AS events included or excluded in >10% of transcripts in hNPCs were detected in nonsense and missense affected genotypes (Figure 4G). In total, 185 AS genes in THOC6 W100*/W100* and 105 AS genes in THOC6 E188K/E188K hNPCs are known causative genes for syndromic ID represented in the SysIDdb (Figure 4G). Aberrantly spliced ID genes were identified in both THOC6-affected genotypes, consistent with a role for THOC6 in ID. Thirty-seven ID genes (1.3% of SysIDdb) are AS in both affected genotypes, identifying genes for shared mechanisms that may preferentially contribute to TIDS pathology. To identify biological mechanisms implicated by THOC6-dependent AS, biological pathway enrichment analysis was performed on mis-spliced genes in affected cells. Genes with differential splicing were significantly enriched for functions in RNA splicing, cell projection organization, membrane trafficking, organelle organization, mitotic cell cycle, and DNA damage response (Figure 4H). RNA processing is tightly controlled by feedback loops (e.g., auto-repression by poison exons or intron retention), which would explain how effects on cis elements may lead to changes in trans factors (i.e., AS events in splicing regulatory factors).
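The intersection of AS genes with a curated ID gene list can be sketched with set operations. The function name and the placeholder gene symbols below are hypothetical; a real analysis would use the rMATS gene lists and the SysID gene table.

```python
def shared_id_genes(as_genes_by_genotype, id_gene_db):
    """Return ID genes that are alternatively spliced in every genotype."""
    id_genes = set(id_gene_db)
    per_genotype = [set(genes) & id_genes for genes in as_genes_by_genotype.values()]
    if not per_genotype:
        return set()
    return set.intersection(*per_genotype)

# Placeholder symbols, for illustration only.
as_genes = {
    "W100*/W100*": {"GENE_A", "GENE_B", "GENE_C"},
    "E188K/E188K": {"GENE_B", "GENE_C", "GENE_D"},
}
id_db = {"GENE_B", "GENE_D", "GENE_E"}

print(shared_id_genes(as_genes, id_db))
```

Genes returned by the intersection are those mis-spliced in both affected genotypes, the set highlighted above as candidates for shared pathology.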
Retained intron AS events are often subject to NMD in protein-coding genes (PCGs), while intron inclusion in lncRNAs alters nuclear export and conformation. Given the high number of RI events detected in THOC6-affected hNPCs, we tested the correlation between differential expression and RI events. For this analysis, gene-level mRNA abundance fold-change in affected hNPCs was correlated to the change in intron inclusion within transcripts from PCGs or non-coding RNA loci, referred to as the change in percentage spliced-in (ΔPSI). We observed a trend in both THOC6-affected genotypes that lower gene expression correlates with greater intron inclusion (slope, p = 0.0045 for THOC6 W100*/W100* and p = 0.0002 for THOC6 E188K/E188K, simple linear regression) (Figure 5B). Conversely, the quadrant representing differential intron exclusion and elevated expression was prominently represented by lncRNAs, where intron exclusion is implicated in impaired lncRNA function. The three significantly dysregulated ASGs represented in both affected genotypes were MEG3, PAX6, and POSTN. Consistent with the observed trend between RI events and expression, analysis of all DEGs revealed that PCGs make up the largest portion of DEGs (with a greater portion, 96.45%, affected in downregulated genes), while the portion of lncRNAs is highest in upregulated genes (6.29%, compared to 1.98% of non-significant genes), reflecting the molecular differences between these distinct mRNA subtypes.
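The relationship between intron inclusion and expression described above is a simple linear regression of gene-level log2 fold-change on the change in intron inclusion. The sketch below fits that line by ordinary least squares; the function name and the data points are hypothetical, and testing whether the slope differs from zero would be done with a statistics package.

```python
def least_squares(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical per-gene values: change in intron inclusion vs. log2 fold-change.
delta_inclusion = [-0.2, 0.0, 0.2, 0.4]
log2_fold_change = [0.5, 0.0, -0.5, -1.0]

slope, intercept = least_squares(delta_inclusion, log2_fold_change)
print(slope, intercept)
```

A negative slope corresponds to the reported trend, in which greater intron inclusion accompanies lower gene expression.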
Additional mRNA characteristics that may account for a portion of the observed differential expression are gene length and isoform number. In affected hNPCs, significantly more transcripts from long genes with, on average, fewer than 10 annotated isoforms were identified compared to non-significant genes (Figures 5D and 5E). The trend towards DEGs with fewer transcript isoforms in affected hNPCs suggests alternatively spliced transcripts are more stable in affected cells (Figure 5C). These findings again reflect a requirement for the larger THOC6-dependent TREX tetramer complex in facilitating mRNP processing of long mRNAs with high expression in brain.
To identify transcription factor networks dysregulated in THOC6-affected hNPCs, transcription factor motif enrichment analysis was performed. Significant enrichment of MEF2, LHX3, and SRF target genes was observed in heterozygous controls compared to both affected genotypes (Figure S5D). Using a second analysis tool, ChEA3, differential expression of SOX, FEZF, FOX, and GLI target genes, and downregulation of HEYL, TWIST, FOX, MEOX2, PRRX2, and MKX target genes were enriched in affected hNPCs (Figure S5E). These transcription factor networks are important for neuronal differentiation and fate specification [62-64], concordant with GSEA findings. Together, these results suggest that gene expression programs that modulate timing of the switch from neural proliferation to differentiation are altered in TIDS.
To refine specific candidate genes implicated in shared TIDS neuropathology, DEGs between affected THOC6 genotypes were intersected. Twelve genes were upregulated and 117 were downregulated in affected hNPCs, with notable lncRNAs represented. Significant enrichment was detected in Integrin 1 pathway and extracellular matrix protein interaction networks (Figure S5C). Using mRNA obtained from three additional replicate differentiations of hNPCs per genotype, significant upregulation of MEG3, MEG8, ESRG, and NEAT1 lncRNAs was confirmed by qRT-PCR (Figure 5G). RNA FISH confirmed increased expression of MEG3 in affected hNPCs compared to controls, with elevated signal observed in both nuclear and cytoplasmic fractions (Figure 5H). Upregulation of functional lncRNAs NEAT1 and MEG3 has been linked to activation of WNT signaling and suppression of TGF-β signaling, respectively [65,66]. Concordant with these findings, the protein levels of WNT and TGF-β signaling components in THOC6-affected hNPCs exhibit a corresponding differential up- and down-expression relative to controls. Specifically, WNT signaling components WNT7A and TP53 showed increased protein expression, with higher abundance detected in affected hNPCs (Figure 5I). TGF-β pathway proteins HAPLN1 and TGFB2 also showed reduced protein expression in affected hNPCs together with high CEMIP and DKK2 (Figure 5I). We propose that loss of THOC6 leads to lncRNA-mediated dysregulation of key developmental signaling pathways, which has implications for the balance of proliferation and differentiation during neural development.

Apoptotic upregulation and retained intron enrichment in Thoc6 fs/fs E9.5 mouse forebrain
To investigate the conserved THOC6-dependent TREX functions that account for divergent phenotypic outcomes between mammalian models, mRNP processing was assessed in E9.5 mouse brain using RNAseq experiments complementary to those performed in hNPCs (Figures 6A and S6A). Three biological replicates were analyzed per genotype (Thoc6 +/+, Thoc6 fs/+, Thoc6 fs/fs). Fewer significant AS events (FDR <0.05) were detected in Thoc6 fs/fs E9.5 brain than in affected hNPCs, but the pattern of AS events was recapitulated, with the majority of AS events categorized as SE (45%) and RI (26%) (Figure 6B). Greater than 40 PSI was quantified in the Thoc6 fs/fs transcriptome, and retained and excluded intron events in Cenpt, Admts6, and Fam214b were validated (Figures 6C and S6B). Maximum entropy model analysis of splice junctions revealed significantly weaker 3' splice site strengths for SE events and weaker 5' splice sites associated with RI events in the Thoc6 fs/fs mouse model. While this signature of splice site weakness is more modest in mouse than in human THOC6 models, these findings suggest a conserved role of the THOC6-dependent TREX tetramer in coordinating mRNA processing that precedes TREX export functions (Figures 6D and S6C).
Notably, biological pathway and network enrichment analysis of AS genes identified mRNA processing, pre-miRNA processing, de-adenylation of mRNA, central nervous system development, forebrain development, multicellular growth, response to oxidative stress, cytoskeletal organization, and neuron projection (Figure S6D), several of the biological categories associated with hNPC ASGs. These shared findings suggest selective conservation of mRNP processing mechanisms by THOC6 in mouse and human forebrain.
Nevertheless, downregulated genes may convey important pathology. First, downregulated genes functionally converge on neurogenesis, proliferation, and differentiation pathways (Figure 6F). Upregulated genes are implicated in the hypoxic response, HIF-1 signaling pathway, and glycolysis, biological categories indicative of increased apoptosis in affected cells (Figure 6F). To investigate whether altered transcription factor networks contribute to pathway dysregulation, we performed GSEA transcription factor motif enrichment analysis. HIF1, NRSF, SMAD3, and STAT3 target genes were enriched in Thoc6 fs/fs E9.5 forebrain (Figure S6G). HIF-1 and STAT3 can induce apoptosis in response to hypoxia [67,68], which is consistent with the observed elevation of apoptosis in Thoc6 fs/fs E9.5 neuroepithelium (Figures 3E and 3F). SMAD3 signaling is activated by TGF-β to promote cortical differentiation [58], suggesting shared disruption of TGF-β signaling in both human and mouse model systems.
DEGs shared between mouse and human model systems are consistent with conserved TIDS molecular pathology (Figure 6G). More Thoc6 fs/fs DEGs overlapped with THOC6 W100*/W100* (23 genes) than with THOC6 E188K/E188K (9 genes) samples, and these include genes involved in neurogenesis, hypoxic response, and synapse regulation. Ier3, Islr2, Wnt7a, Kcnt2, Anax2, and Vegfa DEGs shared across affected models were confirmed by qRT-PCR in three additional E9.5 forebrain biological replicates for Thoc6 +/+, Thoc6 fs/+, and Thoc6 fs/fs samples (Figures 6E and S6H). Overlapping affected human and mouse molecular mechanisms suggest shared pathology. However, the extent of upregulation of genes in response to increased apoptosis is exacerbated in mouse, highlighting species-specific phenotypic differences due to loss of THOC6.

Delayed differentiation and elevated apoptosis in THOC6-affected forebrain organoids
THOC6 pathogenesis in human cortical development was investigated using dorsal forebrain-fated organoids, neurally differentiated from iPSC lines (Figure 7A). Forebrain organoids recapitulate the cellular heterogeneity and developmental dynamics of early corticogenesis [69]. Within each organoid, several neural rosette (NR) structures develop stochastically to recapitulate features of in vivo ventricular zone development, including hNPC proliferation and differentiation to cortical neuron fates (Figure 7A). NR morphology was evaluated in cortical organoids at 28 days of neural differentiation (ND) from three independent differentiations per genotype. To minimize the effect of inter-cell line NR variability, the following analyses focus on heterozygous unaffected and homozygous affected comparisons.
To assess alterations in the timing of differentiation in affected NRs, we performed EdU-pulse labeling at day 21ND for 24 hours to label mitotically active cells, followed by organoid immunohistochemistry analysis at day 28ND (Figures 7E, 7F, S7C, and S7D). To assess the balance of multipotency and differentiation, EdU, KI67, and DCX co-labeled cells per NR were quantified. A significant increase in cells co-stained with the proliferation marker KI67 and EdU per affected NR was detected at day 28ND (p < 0.0001, two-tailed t test), indicating affected NPCs remain mitotically active longer than control NPCs (Figure 7F). This finding, paired with elevated mRNA and protein expression of OCT4 in affected hNPCs (data not shown), supports a retention-of-multipotency model. Consistent with this finding, we observed a significant reduction in the fraction of EdU cells co-labeled with the migrating neuron marker doublecortin (DCX) in affected NRs (p < 0.0001, two-tailed t test) (Figure 7F). Paired with the prolonged proliferation dynamics, this suggests a disruption to the differentiation timeline in affected organoids.
To investigate effects of reduced NR growth on organoid size, we measured whole organoid cross section areas weekly from day 21 to day 42. Compared to the steady size increase of THOC6 +/+ organoids, affected organoids showed a slower growth rate (E188K/E188K: p = 0.0122; W100*/W100*: p = 0.0362) (Figure 7G). Together, our findings implicate a pathogenic mechanism of delayed differentiation due to reduced NPC proliferative capacity and elevated apoptosis with subsequent cortical growth impairment in affected organoids.
The resulting molecular impact preferentially disrupts processing of long mRNAs with a high incidence of intron retention. The neurological features of TIDS and the enrichment of long genes expressed in the brain further implicate a critical role for the THOC6-dependent TREX tetramer in providing a platform for enhanced coalescence of mRNP processing cofactors and maintenance of mRNA structural integrity during splicing of long mRNA transcripts important for brain development. Cell-type and organism-specific requirements for splicing and gene length are predicted to inform the variation in tolerance that underlies distinct human phenotypic presentations and interspecific differences.
Our analysis of the THOC6 alleles, carrying either nonsense or missense variants, indicates that pathogenic mRNA transcripts do not undergo NMD. Stable THOC6 nonsense mRNAs correlate with a notable reduction in full-length THOC6 expression, with no evidence for expression of a truncated product. Very low THOC6 expression in homozygous THOC6 nonsense cells is consistent with the expression of a minimal readthrough product, by a nonsense suppression mechanism observed in both human and mouse cells. THOC6 missense variants do not trigger NMD or affect protein abundance. The shared clinical phenotypes between biallelic THOC6 nonsense and missense variants indicate that THOC6 variants generally function through a LOF genetic mechanism, with both low protein abundance and normal expression of variant protein disrupting TREX tetramer complex formation. Given the conserved functions of TREX in mammals, the phenotypic discrepancy between Thoc6 fs/fs mouse embryonic lethality and human biallelic THOC6 TIDS features is notable and suggests species-specific aspects of THOC6 pathogenic mechanisms. Superficially, this finding suggests that humans are more tolerant to THOC6 variants. Alternatively, this may reflect divergence in the downstream genetic mechanisms of the human and mouse variants investigated. Human pathogenic THOC6 variants reside in the WD40 repeat domains whereas the mouse Thoc6 fs variant is located upstream of the first WD40 repeat domain. Nonsense THOC6 variant transcripts are stable and readthrough permits limited THOC6 expression.
Conversely, Thoc6 fs transcripts are subject to NMD, impairing THOC6 expression, and readthrough causes low expression of full-length THOC6 protein at restricted developmental timepoints. This suggests that minimal THOC6 expression may be sufficient for human embryogenesis.
TREX has a prominent role in nuclear RNA export 46,76. As such, a significant reduction in global polyA+ RNA nuclear export was predicted to result from loss of THOC6-mediated TREX function. However, the absence of global export defects suggests that THO dimers in THOC6-affected hNPCs maintain their conserved function for RNA export. Given this finding, alternative THOC6 pathogenic mechanisms may represent tetramer functions. The significant splicing changes implicate a pathogenic mechanism where THOC6-dependent disruption of TREX tetramer formation indirectly disrupts coordination of multiple steps of mRNP processing, including splicing, upstream of polyA+ mRNA packaging and export. This interpretation is supported by the diversity of mRNP processing functions attributed to tetramer-associated cofactors. UAP56 and ALYREF play important roles in mediating pre-mRNA splicing decisions 46,87. This finding does not rule out the possibility that THOC6 plays a direct role in pre-mRNA splicing outside of mediating TREX core tetrameric assembly on the EJC. That THO member THOC5 can interact with unspliced transcripts 34, and that WD40-repeat domains facilitate splicing factor interactions with pre-mRNA 48, are evidence in support of this possibility.
The crystal structure of human THO-UAP56 was recently solved, implicating several putative consequences of THOC6-dependent TREX tetramer disruption (Figure 7H) 25. Defects in TREX tetramer assembly are not predicted to disrupt formation of stable functional dimers, allowing THOC6-depleted models to discriminate between dimer and tetramer functions. The tetramer affords greater surface area for mRNP processing and permits enhanced co-adaptor loading of known and potentially species-specific TREX cofactors in species with a substantial splicing burden. Indeed, we see evidence for reduced association of ALYREF with THO complexes by THOC6 co-immunoprecipitation in THOC6-affected iPSCs, providing indirect evidence for altered TREX tetramer formation with impaired binding of tetramer-associated mRNP processing cofactors (Figure 2F). Tetramer formation juxtaposes two UAP56 helicases on each end of dimers 88 to support a greater number of cofactors with broad mRNP processing capabilities (e.g., CHTOP in 3' end processing and ALYREF in splicing) 46 known to compete for the limited number of UAP56 binding sites 36,89,90. The tetramer also affords a greater surface area to maintain the structural integrity of long mRNA transcripts during mRNP processing and export. Through this process, TREX serves as an mRNA chaperone to prevent formation of DNA-RNA hybrid or R-loop structures that can promote genome instability 38,39. Thus, the TREX tetramer helps ensure mRNP quality control and evasion of degradation by the nuclear exosome 91. The tetramer may also enable multiple TREX complexes to simultaneously bind several mRNP regions 25 to facilitate compaction and/or protection of longer transcripts with elevated splicing.
Enhanced mRNA processing efficiency provided by UAP56, ALYREF, CHTOP, and other export adaptors associated with the TREX tetramer is susceptible to THOC6 pathogenic variants. Likewise, the diversity of expressed isoforms increases phylogenetically (yeast versus mammals), as well as during mammalian differentiation 29,30,93, especially in the brain, highlighting the vulnerability of neural development to THOC6 loss 40,94. Neuronal expression of long genes is imperative for proper neuronal differentiation and synaptogenesis 95. ALYREF is predicted to interact with other UAP56/TREX-bound adaptors associated to the same mRNP 25, suggesting that the tetramer facilitates increased mRNP compaction required for proper export, especially of longer mRNAs. It is possible that enhanced selectivity of mRNP processing and export co-evolved with TREX conformation, contributing to organismal complexity and distinguishing mammalian and yeast cells.
Our findings implicating THOC6-dependent TREX tetramers as indirect facilitators of splicing, through coordination of the mechanics of mRNP processing, are also supported by enrichment of aberrant splicing events at weaker splice sites. Weak splice sites are most often utilized by transcripts during alternative splicing, and genes with elevated isoform diversity from alternative splicing are more susceptible to disruption of the overall integrity of mRNP processing in THOC6-affected hNPCs. Long genes that are highly expressed in the brain particularly rely on such infrastructure to ensure pro-neural gene expression.
This may also be supported by the observed enrichment of retained introns in THOC6-deficient cells. Indeed, retained introns in unaffected tissues typically have weaker 5' and 3' splice sites compared to other splice junctions 96. In addition, intron retention increases during mammalian differentiation 97, again suggesting differentiated cells may be more susceptible to loss of THOC6-mediated TREX tetramer functions by miscoordination affecting splicing outcomes.
Differences in splicing requirements may also account for interspecific phenotypic differences. While SE and RI events were the predominant splicing defects in both model systems, AS events affect more protein coding transcripts and lncRNAs in THOC6-affected hNPCs. Despite fewer AS events in Thoc6 fs/fs cells, lethality in the mouse could be attributed to aberrant alternative splicing of specific transcripts important for mouse embryogenesis. Additionally, previous findings indicate that RI events account for a substantial portion of splicing variation in the primate prefrontal cortex, a trend that is most pronounced in humans 98. Although intron retention is a known mechanism of mouse neuronal gene regulation by initiating RNA exosome-mediated degradation 99, it is possible that human cells are more tolerant than mouse cells to elevated intron retention. Further investigation of these interspecific differences is important for generating translationally relevant discoveries.
The number of ID genes that are mis-spliced in THOC6-affected hNPCs relative to controls implicates shared underlying developmental mechanisms of ID pathology. However, the developmental impact of individual defects on TIDS neuropathology is complicated by the compounding effects of constitutive THOC6 LOF models. In addition to trends shared with the mouse model, we show that biallelic THOC6 LOF is responsible for disruption of key TGF-β and Wnt signaling pathways via a mechanism that involves dysregulation of signaling components and lncRNAs, resulting in delayed hNPC differentiation, prolonged retention of multipotency, and enhanced apoptosis. This is exemplified by intron retention and upregulation of MEG3 in affected hNPCs. MEG3 is linked to the regulation of TGF-β signaling and other EZH2 common target genes 66. Our findings suggest that RI events alter MEG3 subcellular localization, expression, and downstream WNT signaling that increases multipotency and disrupts the balance of proliferation and differentiation in affected hNPCs. A shift towards cytoplasmic localization of lncRNAs has evolved in human cells, which is important for the maintenance of stem cell pluripotency (e.g., cytoplasmic FAST binds E3 ubiquitin ligase β-TrCP to block its interaction with β-catenin and enable activation of Wnt signaling) 100,101. Given the increased diversity of lncRNA functions in human developmental biology, mouse cells may be less tolerant to lncRNA dysregulation than human cells. In addition, MEG3 is also upregulated by CREB 102, whose target genes are affected in Thoc5 conditional knockout mouse cortical neurons 82, potentially reflecting a shared mechanism of THO dysregulation in neural cells. While our analyses from mouse and human organoid models of Thoc6 and THOC6 disruption provide insight into the molecular pathology of early neural development, later analysis of synaptic physiology will be important to elucidate mechanisms of neuronal dysfunction in TIDS.
Altogether, our findings expand the TIDS clinical population and provide novel functional insight into the pathogenic mechanisms of biallelic LOF variants in THOC6 using comparative mammalian model systems. Functional studies with THOC6 enable us to assess TREX tetramer function while retaining THO subcomplex formation, and our findings provide novel insight into TREX splicing functions separate from export. Future work is needed to dissect the direct and indirect effects of THOC6 loss and confirm endogenous tetramer disruption under native protein conditions. In addition, the well-known role of several TREX members in determination of polyadenylation site choice necessitates research focused on characterizing global aberrant alternative polyadenylation changes that could be contributing to dysregulation. Follow-up investigation of later-stage cortical organoids, and use of unbiased single-cell RNAseq profiling, will allow for more detailed assessment of the developmental consequences of observed defects for cortical lamination and cell type composition. Lastly, it will be important to determine if alterations in mRNA processing/export also underlie synaptic defects, a morphological basis of ID and a prominent clinical feature shared between THOC2 and THOC6-associated neurodevelopmental disorders.

Human subjects
All subjects or parents/guardians provided informed consent and were enrolled in institutional review board-approved research studies. In all cases, the procedures followed were in accordance with the ethical standards of the respective institution's committee on human research and were in keeping with international standards. Probands 1-3 and 5 were identified through GeneMatcher 103. Details for all subjects are provided in Table S1.

Animal models
All mice were maintained in accordance with the National Institutes of Health Guidelines for the Care and Use of Laboratory Animals, and all procedures were approved by the Case Western Reserve Institutional Animal Care and Use Committee. CRISPR genome editing was performed in the University of California, San Diego Transgenic and Knockout Mouse Core. C57BL/6JN hybrid mice (Jackson Laboratory, 005304) were used for CRISPR editing of the Thoc6 locus. Founder mice with the Thoc6 fs/+ allele were intercrossed with C57BL/6JN mice (Jackson Laboratory, 005304) for line maintenance. All ex vivo analyses were performed on tissue collected from mice of both sexes at embryonic day (E) 8.5-10.5. Sex-dependent differences were not assessed.
Litters were genotyped by allele-specific polymerase chain reaction (AS-PCR). Genomic DNA was prepared from mouse tissue samples as previously described 104. AS-PCR for each allele was assembled using the standard GoTaq DNA polymerase (Promega) protocol. Reaction conditions were executed as recommended by the manufacturer. Primers and sgRNA sequences are provided in Table S3.

Whole exome sequencing and analysis
Exome libraries from genomic DNA of all BBIS-affected probands were prepared and captured with the Agilent SureSelectXT Human All Exon 50Mb Kit for Probands 1 & 4-7 and Individual 5, the Agilent SureSelectXT Clinical Research Exome kit for Proband 3, and the TruSeq Rapid Exome Kit for Proband 2.
Further, exome libraries were sequenced on an Illumina HiSeq or NextSeq instrument as described previously 105 .
Reads were aligned to the human reference genome NCBI builds 37 (GRCh37) and 38 (GRCh38) using Burrows-Wheeler Aligner (BWA) 106. Variant calling of single nucleotide variants (SNVs) and copy number variants (CNVs) was performed using GATK 107, VEP, and CoNIFER 108. Average depth of coverage was calculated across all targeted regions. The data were filtered and annotated from the canonical THOC6 transcript (ENST00000326266.8 and ENSP00000326531.8) using in-house bioinformatics software 109. Variants were also filtered against public databases including the 1000 Genomes Project phase 3 11, Exome Aggregation Consortium (ExAC) v.0.3.1, Genome Aggregate Database (gnomAD), and National Heart, Lung, and Blood Institute Exome Sequencing Project Exome Variant Server (ESP6500SI-V2). Those with a minor allele frequency >3.3% were excluded. Additionally, variants flagged as low quality or putative false positives (Phred quality score 14; 15 < 20, low quality by depth <20) were excluded from the analysis. Variants in genes known to be associated with ID were selected and prioritized based on predicted pathogenicity.
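
The population-frequency filter described above can be illustrated with a minimal Python sketch. The record layout, field names, and example frequencies are hypothetical and do not come from the in-house software; only the 3.3% minor allele frequency cutoff is taken from the text, and treating a variant absent from every database as frequency 0 is an assumption.

```python
# Minimal sketch of a population-frequency filter (hypothetical record layout).
MAF_CUTOFF = 0.033  # 3.3%, as stated in the text


def max_population_maf(variant):
    """Return the highest minor allele frequency reported across databases.

    Databases with no record for the variant (None) are ignored; a variant
    absent from every database is treated as frequency 0.0 (assumption).
    """
    observed = [f for f in variant["maf_by_db"].values() if f is not None]
    return max(observed, default=0.0)


def passes_frequency_filter(variant, cutoff=MAF_CUTOFF):
    """Keep a variant unless its frequency exceeds the cutoff in any database."""
    return max_population_maf(variant) <= cutoff


# Illustrative variants with made-up frequencies.
variants = [
    {"id": "var_rare", "maf_by_db": {"gnomAD": 0.0001, "ExAC": None}},
    {"id": "var_common", "maf_by_db": {"gnomAD": 0.12, "ExAC": 0.10}},
    {"id": "var_novel", "maf_by_db": {"gnomAD": None, "ExAC": None}},
]
kept_ids = [v["id"] for v in variants if passes_frequency_filter(v)]
```

Taking the maximum across databases is the conservative choice here: a variant is excluded as soon as any one reference population reports it above the cutoff.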

Sanger sequencing
All variants discovered by WES were confirmed with Sanger sequencing of THOC6 for each individual and respective family members who submitted samples, except Proband 1, where high-coverage WES of THOC6 in the proband and parents was deemed sufficient to report without Sanger confirmation. Chromatograms were analyzed using NextGENe, Sequencher, and Geneious Prime Software (v.2022.1.1).

Cerebral organoid generation
Telencephalic cerebral organoids were generated based on previously published protocols 110, with a few modifications to start with low cell density in order to generate smaller and more consistent embryoid bodies (EBs). Briefly, human and chimpanzee iPSCs were passaged into 96-well V-shaped bottom ultralow attachment cell culture plates (PrimeSurface® 3D culture, MS-9096VZ) to achieve a starting cell density of 600-1,000 cells per well in 30 µl of mTeSR™1 with 1 nM ROCK inhibitor. After 36 hours, 150 µl of N-2/SMAD inhibition media (cocktail of 1X N-2 supplement (Invitrogen 17502048), 2 μM A-83-01 inhibitor (Tocris Bioscience 2939), and 1 mM dorsomorphin (Tocris Bioscience 309350) in DMEM-F12 (Gibco 11330032)) was added for neural induction. On day 7, EBs were transferred to Matrigel-coated plates to enrich for neural rosettes at a density of 20-30 EBs per well of a 6-well plate, and media was changed to neural differentiation media (0.5X N-2 supplement, 0.5X B-27 supplement (Invitrogen 17504044) with 20 pg/ml bFGF and 1 mM dorsomorphin inhibitor in DMEM/F-12). For organoid differentiation, EBs were outlined on day 14 using a pipette tip and uplifted carefully with a cell scraper to minimize organoid fusion and tissue ripping. Media was changed once more to N-2/B-27 with bFGF only, and plates with uplifted organoids were placed on a shaker in the incubator set at a rotation speed of 90.

Western blot analysis and immunoprecipitation
ESCs, iPSCs, and NPCs used for western blot analysis were pelleted and lysed in RIPA buffer supplemented with 1:50 protease inhibitor cocktail (Sigma-Aldrich P8340) and 1:100 phosphatase inhibitor cocktail 3 (Sigma-Aldrich P0044) using mortar and pestle coupled with end-over-end rotation for 30 minutes to 1 hr at 4°C. Protein concentration was quantified by BCA (Thermo Scientific Pierce A53227). Lysate samples were then incubated at a 1:3 ratio with 4x Laemmli sample buffer (Bio-Rad) supplemented with 10% BME and incubated at 95°C on a heat block for 5 minutes for denaturation. For co-immunoprecipitation, primary antibodies anti-THOC5 and anti-THOC6 (1:50 dilution in 1x PBS with Tween-20) were incubated overnight at 4°C with Dynabeads Protein G (Invitrogen, 10003D). Beads were washed and cell lysate (35 μg of protein) was added for incubation overnight at 4°C with rotation. IP samples were prepared according to manufacturer's instructions with elution in Laemmli sample buffer with 10% BME. For promotion of readthrough of premature termination codons, ataluren (eMolecules NC1485023) was dissolved in DMSO and added to ESC/iPSC media at a final concentration of 30 mM for 48 hours as previously described 111. Protein was then extracted as described above.
Samples were loaded into 4-20% SDS-polyacrylamide gels (Bio-Rad) and proteins were separated by electrophoresis at 30V for ~4 hours at room temperature. Separated proteins were then transferred to PVDF membranes (Millipore) overnight using a wet transfer system (Bio-Rad) at 4°C. For immunoblotting, membranes were incubated in 5% milk blocking buffer (1x TBS-T) followed by primary antibody incubation overnight at 4°C with rotation. Membranes were washed 3 times for 5 minutes in 1x TBS-T and then incubated with secondary antibodies for 1-2 hours at room temperature. Membranes underwent final washes before developing using West Femto Substrate (ThermoFisher 34095) with film exposure.

Immunofluorescence and Single-molecule Fluorescence in situ Hybridization
Human NPCs were fixed in 4% paraformaldehyde (PFA) for 20 minutes. Human cortical organoids and mouse embryos were fixed in 4% PFA for 24 hours at 4°C, cryoprotected in 15% and 30% sucrose in 1x DPBS for 24 hours at 4°C, then embedded in OCT with quick freezing in -50°C 2-methylbutane, followed by cryosectioning for immunostaining. Mouse embryos were sectioned at 13 µm and organoids at 16 µm.
Samples for immunostaining were incubated for 1 hour with blocking buffer (5% NDS (Jackson ImmunoResearch), 0.1% Triton X-100, 5% BSA) at room temperature, then overnight with primary antibodies diluted in blocking buffer at 4°C, and for 1-2 hours in secondary dilution at room temperature. Washes were performed in PBS. For nuclear staining, samples were incubated at room temperature for 10 minutes in Hoechst or DAPI (1:1000 dilution in PBS) prior to final washes. For EdU labeling detection, the Click-IT EdU imaging kit (Invitrogen C10337) was used according to the manufacturer's instructions. After incubation with the Click-IT reaction cocktail, sections were blocked and immunostained as described above. Some antibodies required antigen retrieval via incubation in heated 10 mM sodium citrate solution.
Embryos and organoids used for RNA Fluorescence in situ Hybridization (FISH) were fixed and cryoprotected as indicated above using RNAse-free PBS. RNAse-zap treatment of sectioning equipment was performed prior to cryosectioning. NPCs for RNA FISH were fixed in RNAse-free 4% PFA then permeabilized in PBS-TritonX (0.1%) for 15 minutes. Hybridizations were then performed overnight at 37°C with a final concentration of 2 ng/µl of Cy3-conjugated oligo-dT(30-mer) probe, MALAT1 (Quasar-670, Stellaris VSMF-2211-5), and/or MEG3 (Quasar-570, Stellaris VSMF-20346-5). Saline-sodium citrate washes were performed before and after hybridization, followed by nuclear staining with RNAse-free Hoechst-PBS wash (1:1000 dilution) and final wash in RNAse-free PBS.
Glass covers were mounted onto all slides with Prolong Gold (Molecular Probes S36972) and incubated for 24 hours at room temperature prior to imaging. Imaging was performed with a Nikon A1ss inverted confocal microscope using NIS-Elements Advanced Research software. Image analysis was performed using Fiji (ImageJ) software 112. For oligo-dT FISH, Z-series images were taken every 0.2 mm across the entire width of cells for each genotype using the same laser intensity settings and collapsed by maximum intensity using the Z project tool in Fiji. Nuclear and cytoplasmic fractions of polyA intensity were then quantified by automated quantitation with CellProfiler (v4.2.1). Hoechst signal was used to segment nuclei and the oligo-dT signal to segment the cell body. Three differentiation replicates were analyzed per genotype. 3D surface plots were made in Fiji.
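
The nuclear and cytoplasmic polyA fractions can be expressed as simple arithmetic on the two segmented regions. The sketch below is only an illustration under stated assumptions: it takes the integrated oligo-dT intensity inside the Hoechst-defined nuclear mask and inside the whole-cell mask as inputs, and the function name is hypothetical. The actual CellProfiler pipeline is not described in the text.

```python
def polya_fractions(nuclear_intensity, cell_intensity):
    """Split whole-cell oligo-dT signal into nuclear and cytoplasmic fractions.

    nuclear_intensity: integrated signal inside the nuclear mask.
    cell_intensity: integrated signal inside the whole-cell mask, which
        contains the nucleus (assumption of this sketch).
    Returns (nuclear_fraction, cytoplasmic_fraction), which sum to 1.
    """
    if cell_intensity <= 0:
        raise ValueError("cell intensity must be positive")
    if not 0 <= nuclear_intensity <= cell_intensity:
        raise ValueError("nuclear intensity must lie within the cell total")
    nuclear_fraction = nuclear_intensity / cell_intensity
    return nuclear_fraction, 1.0 - nuclear_fraction


# Illustrative numbers only: a quarter of the signal is nuclear.
nuclear, cytoplasmic = polya_fractions(25.0, 100.0)
```

Under this definition, an export block would be expected to raise the nuclear fraction, which is the readout the WGA control is designed to produce.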

WGA inhibition of nuclear export
Confluent NPCs were incubated with digitonin at 30 mg/mL diluted in DMSO and WGA conjugated to Alexa Fluor 488 (Invitrogen, W11261) at 5 mg/mL diluted in DPBS for 5 minutes, as previously described 45. Cells were washed to remove digitonin and WGA only was added to media at 5 mg/mL for 1 hour. Control NPCs were only treated with digitonin. Cells were fixed and prepped for oligo-dT FISH as described above.

RNA sequencing and bioinformatics analysis
Total RNA was extracted from cultured hNPCs (two biological replicates per genotype) using TRIzol Reagent (Invitrogen 15596026) followed by DNAse column treatment using PureLink RNA extraction kit (Invitrogen 12183018A). Total RNA from dissected E9.5 mouse forebrain tissue (three biological replicates per genotype) was extracted using Picopure RNA isolation kit (Applied Biosystems KIT0204) according to manufacturer's recommendations. hNPC and E9.5 mouse forebrain RNA samples were ribodepleted followed by 151 bp paired-end sequencing on the Illumina NovaSeq (300 cycle), with ~20-30 million reads per sample. Library preparation and sequencing were conducted by the Advanced Genomics Core (AGC) at the University of Michigan. ERCC spike-ins (Invitrogen 4456740) were added for sequencing controls at starting concentrations according to the manufacturer's instructions. FASTQ files were trimmed with Cutadapt v4.1 using default parameters 113. Read quality was assessed by FASTQC v0.11.9 114. MultiQC v1.7 115 was used to visualize FASTQC outputs and compare samples. ERCC spike-in FASTA and GTF annotation files were merged with human GRCh38.p13 reference genome FASTA with GTF release 39 or mouse GRCm39 reference genome FASTA with GTF release M28. FASTQ reads were then mapped to merged files using STAR alignment with parameter '--outSAMtype BAM SortedByCoordinate' 116. Count analysis was performed on sorted BAM files using RSEM with paired-end alignment specified 117. Differential expression analysis was carried out using DESeq2 v1.34.0 118 in R v4.1.2 119. ERCC spike-in counts were used to estimate size factors for each sample for DESeq2 analysis.
Genes were considered dysregulated if FDR < 0.05 and fold-change > 2 or < -2. Volcano and PCA plots were made using ggplot2 and pcaExplorer packages in R.
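
As an illustration of the two steps above, the following Python sketch computes median-of-ratios size factors from spike-in counts only and then applies the stated significance thresholds. It is not the authors' code (the analysis was run with DESeq2 in R). The function names and example counts are hypothetical, and reading "fold-change > 2 or < -2" as |log2 fold-change| > 1 is an assumption.

```python
import math
from statistics import median


def spike_in_size_factors(spike_counts):
    """Median-of-ratios size factors computed from control (spike-in) counts.

    spike_counts maps sample name -> list of counts, with spike-ins in the
    same order for every sample. Spike-ins with a zero count in any sample
    are skipped because their geometric mean is undefined on the log scale.
    """
    samples = list(spike_counts)
    n_controls = len(spike_counts[samples[0]])
    usable = []  # (index, mean log count across samples)
    for i in range(n_controls):
        values = [spike_counts[s][i] for s in samples]
        if all(v > 0 for v in values):
            usable.append((i, sum(math.log(v) for v in values) / len(values)))
    if not usable:
        raise ValueError("no spike-in has nonzero counts in every sample")
    return {
        s: math.exp(median(math.log(spike_counts[s][i]) - log_gm
                           for i, log_gm in usable))
        for s in samples
    }


def is_dysregulated(fdr, log2_fold_change, fdr_cutoff=0.05, fold_cutoff=2.0):
    """Apply FDR < 0.05 and a two-fold change in either direction."""
    return fdr < fdr_cutoff and abs(log2_fold_change) > math.log2(fold_cutoff)


# Illustrative counts: sample_b was sequenced twice as deeply as sample_a.
factors = spike_in_size_factors({
    "sample_a": [10, 20, 30],
    "sample_b": [20, 40, 60],
})
```

Restricting the size-factor estimate to spike-ins avoids the usual assumption that most endogenous genes are unchanged between genotypes, which may not hold when a general mRNA processing factor is disrupted.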
Alternative splicing analysis was performed on sorted BAM files using rMATS v4.1.2 120 with the following parameters: '-t paired --readLength 150 --variable-read-length --nthread 4' 47. AS events were called if FDR < 0.05 and ΔPSI > 10%. Events with fewer than 5 average reads were filtered out using the MASER package in R 121. To calculate splice site strength at 5' and 3' splice sites in AS transcripts identified by rMATS, maximum entropy modeling was carried out using MaxEntScan 54. The required input is a 9-mer sequence at 5' splice sites (3 bases in exon and 6 bases in downstream intron) and a 23-mer at 3' splice sites (20 bases of intron and 3 bases of downstream exon). Scores were plotted in GraphPad Prism (v9.3.1). DAVID (david.ncifcrf.gov/tools) 122 and Metascape (metascape.org) 123 analyses were performed to identify enriched biological pathways based on Benjamini-Hochberg multiple hypothesis corrections of the p-values. To identify potential transcription factors responsible for expression differences, Gene Set Enrichment Analysis (GSEA v4.2.3) against the MSigDB transcription factor motif gene set (c4.tftv7.5.1.symbols.gmt) and ChIP-X Enrichment Analysis v3 (ChEA3) were performed 124.
biological replicates per genotype (independent differentiations for NPCs). Melt curve analysis was performed on all primers to ensure temperature peaks at ~80-90°C. GAPDH and FOS primer sequences were obtained from 43. NEAT1 was obtained from 65 and MEG3 was obtained from 66. All others were designed using NCBI Primer-BLAST. Primer sequences are provided in Table S3.
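
The splice-site input windows defined in the splicing analysis (a 9-mer at the 5' splice site and a 23-mer at the 3' splice site) can be extracted as in the minimal Python sketch below. The function names and the example sequences are hypothetical. Scoring itself is left to MaxEntScan and is not reimplemented here.

```python
def five_prime_window(upstream_exon, intron):
    """9-mer for 5' splice site scoring: last 3 exon bases + first 6 intron bases."""
    if len(upstream_exon) < 3 or len(intron) < 6:
        raise ValueError("need at least 3 exon bases and 6 intron bases")
    return upstream_exon[-3:] + intron[:6]


def three_prime_window(intron, downstream_exon):
    """23-mer for 3' splice site scoring: last 20 intron bases + first 3 exon bases."""
    if len(intron) < 20 or len(downstream_exon) < 3:
        raise ValueError("need at least 20 intron bases and 3 exon bases")
    return intron[-20:] + downstream_exon[:3]


# Made-up sequences for illustration (not taken from any real transcript).
upstream_exon = "ATGGCCAAG"
intron = "GTAAGT" + "ACGT" * 3 + "TTTCTCTTTCCTTTTTCCAG"
downstream_exon = "GTCACC"

donor_9mer = five_prime_window(upstream_exon, intron)
acceptor_23mer = three_prime_window(intron, downstream_exon)
```

Both windows are read 5' to 3' on the transcribed strand, so sequences from minus-strand genes would need to be reverse-complemented before slicing.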
mRNA decay analysis was performed using transcription inhibition by Actinomycin D (ActD) based on 43. Human ESCs/iPSCs were first passaged into five 12-well plates. Each plate had the following lines: THOC6 +/+ (H9 ESCs), THOC6 +/+ (AS0041 iPSCs), THOC6 W100*/W100*, THOC6 W100*/+, THOC6 E188K/E188K, THOC6 E188K/+. Once confluent, ActD was added to media of all four plates at 10 mg/mL (Sigma-Aldrich A9415). After 30 minutes, media was removed from one plate and 1 mL of TRIzol Reagent was added directly to each well (t = 0). Cells were uplifted in TRIzol by pipetting and transferred to a fresh tube. Tubes were immediately frozen in TRIzol at -80°C. This was repeated every 30 minutes to obtain the following time points 30 minutes post-ActD treatment: t = 0.5, 1, 1.5, and 2 hrs. Extractions were performed in batches per time point based on the protocol described above. Standard curve analysis was performed to validate primers (Figure S1C). This experiment was repeated to capture a longer decay window using the following time points: t = 0, 2, 4, 8 (Figure S1D). ΔΔCt values were obtained by subtracting the mean t = 0 ΔCt for each genotype. Abundances of THOC6, GAPDH, and FOS (positive control for rapid decay) mRNA were determined.
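
The normalization to t = 0 described above can be sketched in Python as follows. This is an illustration rather than the authors' analysis: the function name and the example ΔCt values are hypothetical, and the conversion to relative abundance uses the standard 2^(-ΔΔCt) transformation, which assumes approximately 100% amplification efficiency.

```python
from statistics import mean


def remaining_fraction(delta_ct_timepoint, delta_ct_t0_replicates):
    """Relative mRNA abundance at a time point versus t = 0.

    delta_ct_timepoint: ΔCt of the transcript at the time point of interest.
    delta_ct_t0_replicates: ΔCt values at t = 0 for the same genotype.
    ΔΔCt = ΔCt(t) - mean ΔCt(t = 0); abundance = 2 ** (-ΔΔCt).
    """
    delta_delta_ct = delta_ct_timepoint - mean(delta_ct_t0_replicates)
    return 2.0 ** (-delta_delta_ct)


# Illustrative values: a one-cycle increase in ΔCt corresponds to half the
# starting transcript abundance.
half_remaining = remaining_fraction(6.0, [5.0, 5.0, 5.0])
```

Because each genotype is normalized to its own t = 0 mean, differences in steady-state expression between genotypes do not affect the decay comparison.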

Quantification and statistical analyses
Statistical significance of all quantifications from microscopy images, western blot images, gel images, and qRT-PCR abundances was tested using a Student's two-tailed t-test, and data were plotted using GraphPad Prism (v9.3.1) as mean ±SEM or mean ±SD, as specified in figure legends. Simple linear regression was performed in qRT-PCR standard curve analysis, organoid growth curves, and intron retention analysis. Statistical significance of gene overlaps was tested using Fisher's exact test via the GeneOverlap package function testGeneOverlap() in R. Benjamini-Hochberg multiple hypothesis corrections were performed in pathway enrichment analyses.
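
The gene-overlap test can be illustrated with a short Python sketch of Fisher's exact test, written as a hypergeometric tail sum. It is a stand-in for the R analysis, not a reproduction of it: the function name, the background size, and the one-sided (enrichment) alternative are assumptions of this sketch.

```python
from math import comb


def overlap_p_value(size_a, size_b, overlap, background_size):
    """One-sided Fisher's exact p-value for observing at least `overlap`
    shared genes between two lists drawn from a common background.
    """
    max_overlap = min(size_a, size_b)
    if not 0 <= overlap <= max_overlap:
        raise ValueError("overlap must be between 0 and the smaller list size")
    if max(size_a, size_b) > background_size:
        raise ValueError("gene lists cannot exceed the background")
    total = comb(background_size, size_b)
    # Sum hypergeometric probabilities for every overlap >= the observed one.
    tail = sum(
        comb(size_a, k) * comb(background_size - size_a, size_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy example: two 5-gene lists that coincide fully within a 10-gene
# background. Only 1 of the comb(10, 5) = 252 possible draws does this.
p_full_overlap = overlap_p_value(5, 5, 5, 10)
```

The choice of background matters: using all annotated genes rather than only genes expressed in the tissue inflates the apparent significance of any overlap.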

DECLARATIONS

DATA AVAILABILITY
Data generated in this study are provided in Supplemental Information. This study did not generate new unique reagents. This paper does not report original code. RNAseq data have been deposited on GEO and are publicly available as of the date of publication. Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Stephanie L. Bielas (sbielas@umich.edu).

Figures

Figure 7
The Ensembl BioMart tool (http://useast.ensembl.org/biomart) was used to obtain coding sequence length, transcript number per gene, gene type, and sequences for AS events. The GeneOverlap v1.32 R package was used to identify overlapping DE and AS hits between affected genotypes. Primary and candidate syndromic ID genes were obtained from the SysID database (https://www.sysid.dbmr.unibe.ch).
(Alkali Scientific Inc., QS1020) according to manufacturer's instructions. Cycler parameters used: cDNA activation (1 cycle at 95°C for 2 minutes), denaturation (40 cycles at 95°C for 5 seconds), and annealing/extension (40 cycles at 60°C for 30 seconds). The ΔΔCt method was used to analyze data with GAPDH as a reference gene. ΔΔCt values were obtained by subtracting mean THOC6 +/+ ΔCt values from each sample. Data shown represent mean values of three qPCR technical replicates per sample for three