Identification of sleep and circadian alternative polyadenylation sites associated with APA-linked human brain disorders

Sleep and circadian rhythm disruptions are comorbid features of many pathologies and can negatively influence numerous health conditions, including degenerative diseases, metabolic illnesses, cancer, and various neurological disorders. Genetic association studies linking sleep and circadian disturbances with disease susceptibility have mainly focused on changes in gene expression due to mutations, such as single-nucleotide polymorphisms. Thus, associations between sleep and/or circadian rhythm and alternative polyadenylation (APA), particularly in the context of other health challenges, are largely undescribed. APA is a process that generates various transcript isoforms from the same gene, resulting in effects on mRNA translation, stability, localization, and subsequent function. Here, we have identified unique APAs in rat brain that exhibit time-of-day-dependent oscillations in expression as well as APAs that are altered by sleep deprivation and the subsequent recovery period. Genes affected by APA usage include Mapt/Tau, Ntrk2, Homer1A, Sin3band Sorl. Sorl1 has two APAs which cycle with a 24 h period, one additional APA cycles with a 12 h period and one more that is reduced during recovery sleep. Finally, we compared sleep- or circadian-associated APAs with recently described APA-linked brain disorder susceptibility genes and found 46 genes in common.


INTRODUCTION
Dysregulation of sleep and circadian rhythms can profoundly impact human health and compound disease 1,2 .Indeed, sleep disruption is associated with negative outcomes in cardiovascular, metabolic, immunologic, and cognitive health that can have substantial short-and long-term consequences 3 .
Alterations in sleep and circadian rhythms are often observed with various brain disorders, including autism spectrum disorder, bipolar disorder, major depression, schizophrenia, Parkinson's, and Alzheimer's diseases [4][5][6][7] .Complicating the association between sleep and health is the fact that functional aspects of sleep remain largely unde ned and inconclusive 8,9 ; however, the use of evolutionarily distinct animal models to study sleep has historically offered keen insights 10,11 .For example, studies on circadian-and sleep-dependent gene-regulatory mechanisms in diverse species, including ies, rodents, and humans, have identi ed important phylogenetically conserved pathways with functional relevance [12][13][14] .Employing unbiased approaches, such as large-scale metabolomic, transcriptomic, and proteomic analyses, have also greatly aided in the generation of conceptual frameworks for characterizing sleep function in health 14,15 .Therefore, performing such discovery-based studies of sleep and circadian regulatory processes in model organisms will help de ne the fundamental biological mechanisms underlying sleep function and inform pre-clinical relevance for comorbidities of sleep dysfunction associated with poor health.
Alternative polyadenylation (APA) site usage is an important and often overlooked mechanism of gene regulation, that can affect mRNA stability, mRNA/protein targeting, translational competence, and generate alternative protein isoforms 16,17 .APA sites are common and occur most frequently in the 3′ untranslated region (3′ UTR) of mRNAs across phylogeny, with more than half of human genes having multiple polyadenylation sites (PASs) that generate alternative isoforms 18 .These isoforms can have altered coding sequences or 3'UTRs, resulting in the diversi cation of cis-regulatory elements (e.g., RNA binding protein sites, microRNA binding sites) that in uence transcript abundance, tra cking, stability, and/or translation e ciency 19 .Furthermore, there's growing evidence of cell-type-speci c APA preference 20 .The involvement of APA in the context of sleep and circadian rhythms has been largely unexplored, with the few studies available mostly focused on peripheral organs 21,22 and cells 23 .Here, we have characterized how APA site usage oscillates based on the time of day as well as how it is altered following acute changes in sleep pressure, speci cally in the adult mammalian brain.Multiple methodologies have been developed for transcriptome-wide pro ling and mapping of APA sites 24,25 .To complete this study, we performed whole transcriptome termini sequencing (WTTS-seq) 26,27 analysis to pro le the variations in APA usage that occur due to sleep pressure and daily rhythms in the rat forebrain.
Over 31,000 PASs were recovered in total, with 45% of the represented genes having multiple APA sites.Interestingly, many of the PASs sequenced were not previously annotated in the rat genome.Moreover, a total of 2,011, (6%) of PASs cycled over the day, and 831 (3%) were homeostatically regulated following sleep loss following sleep loss or during recovery.Over half of all cycling or differentially expressed PASs were APAs, (i.e., in genes with ≥ 2 PASs).Given the importance of sleep [4][5][6][7] and APA in health and disease 24,28,29 , we compared our sequencing results with results from a recent study that determined APA usage in human brain disorder susceptibility 30 .The genes found in both studies warrant further examination and could lead to new rodent models to investigate these disorders.
To the best of our knowledge, the current study represents the rst comprehensive, transcriptome-wide mapping of APA sites in adult mammalian brain tissue over the day-night cycle as well as following changes in sleep homeostasis.This global temporal dataset will be useful for future comparative studies that require the determination of baseline APA site usage pro les in the mammalian brain.Furthermore, our study underscores the importance of using alternative-omic approaches to characterize phylogenetically conserved genome-phenome information and reveals another expansive layer of complexity in sleep and circadian gene regulation that has not previously been documented.

METHODS
Subjects.All animal procedures were carried out in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals and ARRIVE and OLAW guidelines and approved by the WSU Institutional Animal Care and Use Committee (IACUC; ASAF# 6804).Male Long Evans rats (7-9 weeks old) were housed in pairs at 22 ± 2°C on a 12:12 h light-dark cycle.The rats were acclimated to this light cycle for at least 10 days prior to tissue collection, with water and chow ad libitum.Cages were cleaned weekly (between 8 and 11 AM) unless the rats were being euthanized within 24 h.Thirty rats were randomly assigned to one of six groups (n = 5/group) that were sampled every 4 h, beginning 2 h after light onset (zeitgeber time (ZT)) (i.e., ZT2, 6, 10, 14, 18, and ZT22).For the sleep deprivation (SD) study, twenty rats were randomly assigned to 6 h SD from ZT0-6, wherein rats were kept awake by an automated bedding stir bar (Pinnacle) at the bottom of a cylindrical cage.The bar was set to rotate for 4 s, randomly changing rotation direction, and stopped for a random interval ranging from 10 to 30 s 31,32 .
Following SD, rats (n = 5/time point) were euthanized immediately (R0) by live decapitation, or were returned to their home cage for 2 h (R2), 4 h (R4), or 8 h (R8) under red light without disruption before sampling.Five additional rats were euthanized at ZT8 as undisturbed, time-matched controls.The other time-matched controls with undisrupted sleep (i.e., ZT6, 10, and 14) were taken from the circadian analysis described above.Tissue Collection.Rats were decapitated by guillotine under normal room light (ZT2-10) or under dim red light (ZT14-22).Following decapitation, forebrains were resected (Fig. 1A), frozen in 2-methylbutane suspended in dry ice, and then stored at -80°C until homogenization for RNA extraction.
RNA isolation.Just before RNA isolation, forebrains were removed from − 80°C storage and placed on dry ice.Prior to use, a stainless-steel mortar and pestle were cleaned with RNase Zap (Thermo Fisher) and 70% ethanol.The mortar was then partially lled with liquid nitrogen before a forebrain was added, pulverized, and placed in a conical tube.Between each sample, the mortar and pestle were cleaned with 70% ethanol.A small aliquot of sample was removed for RNA isolation using Trizol Reagent (Invitrogen), according to the manufacturer's instructions.Puri ed RNA was resuspended in water, and concentration and purity were measured with a Nanodrop spectrophotometer (Thermo Fisher).Samples were stored at -20°C until further processing was performed.WTTS-seq libraries were prepared as described by Zhou et al. 26 .Brie y, total RNA (2.5 µg) was incubated at 70°C with 10X Fragmentation buffer (Invitrogen) for 3 min.The fragmentation reaction was halted by the addition of Stop Solution and incubation on ice for at least 2 min.Next, poly(A) + RNA was puri ed from the fragmented total RNA with Dynabeads Oligo (dT)25 (Invitrogen), according to the manufacturer's directions, and used for rst-strand cDNA synthesis in a 20 µL reaction mixture.First, 1.0 µL of barcode primer (100 µM) and 1.0 µL of a common SMART primer (100 µM) were annealed to the poly(A) + RNA template by heating to 65°C for 5 min and incubating on ice for at least 2 min.Next, 4.0 µL of 5X First-strand buffer (Invitrogen), 1.0 µL of SuperScript III reverse transcriptase (Invitrogen), 1.0 µL of 0.1 M dithiothreitol, 2.5 µL of 10 mM dNTP, and 1.0 µL of RNase OUT (Invitrogen) were added to the mixture.First-strand cDNA was synthesized by incubating the mixture at 40°C for 90 min in the presence of library-speci c adaptors.Synthesis was terminated by heating the mixture at 70°C for 15 min.RNases I (100 U/µL; Invitrogen) and H (2 U/µL; Invitrogen) were subsequently added and incubated with the mixture at 37°C for 30 min to hydrolyze the remaining single-stranded RNA molecules and ensure that only single-stranded cDNA remained.RNase activities were terminated by heating the samples at 70°C for 20 min.Following puri cation with solidphase reversible immobilization (SPRI) beads, second-strand cDNA was synthesized from rst-strand cDNA by asymmetric PCR.In addition to the cDNA, the 50 µL PCR reaction contained 1.0 µL of Phusion Hi-Fidelity DNA polymerase, 10.0 µL of 5X HF buffer, 1.0 µL of 0.4 µM barcode primer, 1.0 µL of 0.8 µM common primer, 1.0 µL of 10 mM dNTP, and nuclease-free water.The PCR reaction was carried out by heating at 95°C for 30 sec, followed by 20 cycles of 98°C for 10 s, 50°C for 30 s, and 72°C for 30 s, with a nal elongation step at 72°C for 10 min.SPRI beads were used to purify and select 200-500 bp fragments from the nal library.After quality control analyses, the size-selected library was sequenced with an Ion PGM Sequencer at the WSU Genomics Core Laboratory.

Data analysis
Raw read processing.Raw data were obtained from 55 samples and stored in FASTQ format.We ltered raw reads with the FASTQ quality lter in the FASTX Toolkit (v0.0.13), allowing for a minimum score of ≥ 10 for ≥ 50% of bases (http://hannonlab.cshl.edu/fastx_toolkit/).We trimmed T nucleotides or T-rich sequences located at the 5' ends of the reads using Perl scripts, as described in a previous study 26 .
Trimmed reads of at least 16 bp in length were kept for further analysis.
Read mapping and poly(A) site clustering.For each data set, we aligned the processed reads to the Rattus norvegicus genome (mRatBN7.2/rn7)using the torrent mapping program (TMAP, v3.4.1; http://github.com/iontorrent/tmap) with the unique best hits parameter (-a 0).We then extracted raw PASs supported by the uniquely mapped reads from SAM les and obtained a polyadenylation tag (PAT) le using an in-house script.After that, we merged all the PAT les to determine the nal PASs for all samples.PASs within 25 nucleotides of one another were grouped into one polyadenylation site cluster (PAC) using GetPolyaSiteCluster 33 .PACs with few reads were rejected according to different criteria depending on library size.For libraries that had less than 1.7M reads, clusters were retained only if a group of 5 biological replicates had at least 3 samples with ≥ 3 reads.For libraries with more than 1.7M reads, at least 3 samples with ≥ 4 reads were required.
Gene annotation and usage of poly(A) sites.We annotated all the nal PACs for PAS_ID, gene symbol, functional region, and other factors, as indicated, using Cuffcompare (v2.2.1) 34 , Perl scripts, and annotation le (GCF_000001895.5_Rnor_6.0_genomic.gtf;https://ftp.ncbi.nlm.nih.gov/).Clusters that mapped to mitochondrial genes were removed, then the number of PAS-covered reads was normalized 35 to the total number of covered reads within each library and rescaled by a factor of 10 7 .
Circadian/ultradian PAS discovery.Using unnormalized PAS read counts as input, rhythmic patterns were identi ed using the MetaCycle 36 R package meta2d, which synthesizes the results of three cycle analysis algorithms (ARSER, JTK_Cycle, and Lomb-Scargle).The analysis was run 5 times with different replicates inserted into each of the appropriate ZT time slots, and the average p-value, FDR, phase, and other statistics were calculated.Only the highly corroborated PASs that were signi cant (p < 0.05) in all 5 trials were used for all analyses.Plots of read counts use normalized reads per 10 7 and show the SEM of 5 biological replicates.
Detailed mapping of APA sites.The data supporting all gures depicting APA sites was from rat genome build BN7.2 and the UCSC (http://genome.ucsc.edu)and RDG (https://rgd.mcw.edu/rgdweb/homepage/)genome browsers 37,38 .)Gene ontology and pathway analysis.Gene over-representation analysis was performed with the webbased tool WebGestalt 39 .Input gene symbol sets representing genes with cycling APA sites (p < 0.05 in 5 of 5 trials and > 1 PAS) or APA sites that were differentially expressed with sleep pressure (p < 0.01, log2FC > 0.5 and > 1 PAS), were compared to relevantly annotated rat genes using an output threshold of FDR ≤ 0.05.For phase-speci c analysis, a sliding window of 5 h centered on each sample collection timepoint was used.For example, for phase ZT6, all PASs with average phase calculations that ranged from 3.5 to 8.5 were grouped.
Differential expression analysis of sleep deprivation/recovery.To evaluate the expression of PASs in sleep homeostasis experiments, PAS counts from rats recovering from 6 h SD were contrasted with timematched controls (R0 vs ZT6, R2 vs ZT8, R4 vs ZT10, R8 vs ZT14).We removed high variation from the rst principal component systematically, resulting in improved variance estimates for low read counts.Prcomp (in R) was used to perform principal component analysis (PCA) and to nd eigenvectors by way of singular value decomposition.DESeq-2 with "Apeglm" Shrinkage 40 and the Wald Test were used to generate test statistics in R software.The FDRtool was used to determine the Local FDR.

RESULTS
Identi cation of PASs in the rat forebrain.In order to identify changes in PAS usage, replicate circadian samples (central forebrains) were taken from ve rats every four hours starting at two hours after lights on (i.e., ZT2, ZT6, ZT10, ZT14, ZT18 and ZT22) (Fig. 1a, b).RNA was puri ed from these samples and used to generate WTTS-seq cDNA libraries that were subsequently sequenced.Poly(A)-directed sequence reads were then mapped to the rat genome, giving rise to 31,796 PAS clusters (see Supplementary Table S1).All mitochondrial PASs were removed since some of them constituted a large fraction of the total reads.Among the 31,757 non-mitochondrial PAS clusters identi ed, a sizable portion mapped to unannotated or intergenic regions, leaving 26,635 PASs that mapped to named loci (i.e., genes).Many APAs occur at different points within the longest 3' UTR (Fig. 1c, sites 4 and 5).Some are distal to the longest documented 3' UTR (site 6), while some occur in internal exons (site 1) or introns (sites 2 and 3) (Fig. 1c).In our data set of all PASs that mapped to genes, 45% mapped to genes with ≥2 APA sites, and 19% mapped to genes with ≥3 APA sites (Fig. 1d).The average number of APA sites per gene was 1.9.
Identi cation of PASs that exhibit a daily cycle.Periodicity of PAS expression was assessed using meta2 36 .Circadian (24 h period) oscillations were demonstrated for 2,011 PASs.Among these, 1,173 were in genes with ≥2 total APA sites, including ones in known diurnal transcripts, such as Dbp (circadian in 2 of 2 APA sites recovered), Nr1d2 (in 1 of 1), Per2 (in 2 of 2), and Ntrk2 (in 2 of 10) 41 (Table 1 and Supplementary Table S2).
To look for functions or cell components that are particularly affected by APA site usage in a time-of-day dependent manner, we performed pathway and gene ontology (GO) over-representation analyses using the online tool WebGestalt 39 .The set of 1,173 gene symbols corresponding to circadian PASs in genes with ≥2 APAs were input (Table 2 and Supplementary Table S3).Glutamatergic Synapse, Membrane Tra cking and Circadian Entrainment are among the enriched terms.
We were interested whether circadian APAs might cluster predominantly into certain phases of peak expression, and whether APAs that share a common peak phase might also share some functional relationship.When circadian APAs from genes with ≥2 total APAs were grouped by phase, it was evident that some phases had very few APAs relative to other phases and the expression levels of many APAs peaked around ZT20 (Supplementary Fig. S1).GO and pathway analysis on each group and found that only phases 2, 10 and 18 had signi cantly over-represented terms.Phase 18 had the most, with the overrepresentation of multiple signaling pathways, including 'neuron to neuron synapse' and 'post-synaptic specialization' (Supplementary Table S4).
The realization that rhythms shorter than 24h are biologically important is growing [42][43][44][45][46] .Thus, we evaluated the PASs data for ultradian cycling using meta2d with the period set to 12 h.Overall, 1502 PASs that cycled with a 12 h period were identi ed (Supplementary Table S5).Of the 12 h cycling PASs, 1,198 were in genes, and after adjusting for genes with multiple 12h cycling APAs, there were 1,149 unique genes in the set.In total, 827 of the 12 h cycling APA sites were in genes that had ≥2 APAs, representing 778 unique genes.Pathway analysis on this set of 778 unique genes (Supplementary Table S6) showed that CREB phosphorylation and circadian entrainment were highly enriched, while GO analysis of this data set resulted in 16 GO terms related to the synapse.
PASs are differentially expressed after sleep deprivation and during recovery sleep.To investigate changes in APA site usage related to sleep pressure, rats were subjected to SD for 6 h from ZT0 to ZT6, and central forebrain tissue was collected immediately afterwards (R0).Additional animals were allowed to recover for 2, 4, or 8 h after SD (R2, R4 and R8) before tissue was collected.WTTS-seq data from these samples were compared to time-matched controls that were allowed to sleep undisturbed (ZT6, ZT8, ZT10 and ZT14).All groups consisted of 5 biological replicates.Our sequencing data showed that the most signi cant differences in expression were seen when we compared R0 with its control (ZT6) and R4 with its control (ZT10) (Supplementary Table S7 and Fig. 2).Interestingly, a Homer1a APA isoform is the most abundant at ZT6, ZT10 and R0, whereas a full-length isoform is dominant at ZT10 (Supplementary Fig. S2 a and b) Also, the expression of one APA isoform of Prmt1, was upregulated with high con dence in the ZT6 vs R0 (Fig. 2).PRMT1 protein regulates multiple stress response pathways 47,48 .
The gene names of differentially expressed APA sites from genes with ≥2 APAs were used for GO and pathway over-representation analysis (Table 3).ZT6 vs R0 only had signi cant results for GO while ZT10 vs R4 had signi cant GO and pathway results.
Comparison of APA-linked brain disorder susceptibility genes with WTTS-seq identi ed circadian APAs and APAs differentially expressed with sleep pressure.A recent survey by Cui et al. 30 using APA transcriptome-wide association studies (TWAS) highlighted the importance of APA site usage in brain disorders.To establish the extent to which genes with APA-linked neurological phenotypes had circadian or sleep related changes in rats, our list of circadian genes with ≥2 APA sites was compared to those reported in Cui et al. 30 .There were 25 overlapping genes (representing 28 APAs in our data, since three genes had 2 circadian APA sites).Another 19 genes with WTTS-seq-identi ed APA sites that cycle on a 12 h period were identi ed in the TWAS data set, as were nine genes (11 APA sites) that were differentially expressed with sleep pressure.Altogether, 54 APAs representing 46 genes were observed in common with genes having disease-associated APAs (Table 4).

Discussion
APA site usage is an understudied aspect of gene regulation.Although APA sequencing can reveal changes in overall gene expression, it's designed to focus on changes in APA usage and cannot reveal differences in splicing or transcription start sites (TSSs).On the other hand, bulk RNA-seq analysis often ignores APA, TSS and splice isoforms to simply assess reads per gene.Currently it would be very di cult to enumerate copies of all the mRNA isoforms for each gene.Yet appreciation is growing for the importance of APA sites in regulating mRNA stability 16,49 , mRNA/protein localization 19,50,51 , and human disease 30,52 .
Rhythmic APA site usage has been uncovered in the mouse liver 21,22,53 , and in temperatureentrained cultured cells, circadian APA usage occurs in many genes and can regulate expression of speci c central clock genes 23 .Still, alternative poly(A) site usage hasn't been given enough attention.We therefore initiated this investigation into the conjunction of APA with sleep and circadian expression.As far as we are aware, the current study is the rst to examine APA sites related to circadian rhythms and sleep pressure in any mammalian brain.In our analysis, we found 5,122 PASs and 318 circadian PASs that mapped outside of known genes, and many APAs within genes mapped to regions in which 3' ends have yet to be annotated.Based on prior WTTS-seq data sets and other PAS mapping approaches, we expect that some portion of our PASs will be method-based artifacts 26,54 , but, overall, the newly discovered PASs should add valuable insights into regulation of the rat transcriptome and for characterizing PAS usage in the mammalian brain.There are several, diverse ways in which data from this study can translate into biological relevance as described in the examples below.
Here, we observed that 6% of all PASs cycled with a 24 h period.One of the top pathways identi ed for the circadian APA gene set was 'circadian entrainment' (Table 2).Since transcription-translation feedback loops are central to circadian regulation, this may not be surprising, but APA site usage may well have a role 23,53 .For example, we nd that one Sin3b APA is circadian (Fig 3a, b).Sin3b encodes short and long variants conserved in mammals.The short variant binds to CRY1 but cannot bind HDAC1 55 .
The long isoform is implicated in regulation of Per1/Per2 transcription 56 , along with many other genes 57 .In our data, long Sin3b APA reads constitute the predominant isoform at ZT6 and ZT22, while the short, circadian isoform is the most abundant one at ZT10, ZT14 and perhaps ZT2 (Fig 3b).Sin3b transcript levels in mouse hippocampus have previously been reported to be affected by sleep deprivation 58 , although this effect was not observed using TRAP-seq 59 , suggesting post-transcriptional processing can lead to changes in sleep-dependent differential expression.Together with our work, this example highlights the importance of utilizing various -omic approaches to properly decipher the complexity of molecular processing tied to changes in behavioral state in the brain.
Additional signi cant pathways emerged from the circadian APAs, such as Oxytocin, Ephrin, and MAPK signaling that have demonstrated links to the circadian clock [60][61][62] .In the GO analysis of the circadian genes with multiple PASs, we discovered that terms related to the synapse (12), protein localization (6), and vesicles (7) (Table 2 and Supplementary Table S3) were enriched suggesting APAs are poised to affect neural communication.
A large proportion of circadian APAs had expression peaks around ZT20 (Supplementary Fig. S1).Considering that rats are nocturnal, this is similar to what has been seen for bulk transcripts in several human tissues, including brain 63 .Interestingly,among the identi ed circadian APA sites, 3 were in genes for RNA-binding proteins (Celf2, Elavl3, and Rbfox1) whose expressions correlate with more distal APA usage 47 .Peak expression of these three genes is from ZT21 to ZT1, so it would be interesting to see if transcripts of predicted targets tend to be longer at these times.
In addition to the 24 h circadian rhythm, recent studies have also demonstrated the existence of cellautonomous ultradian clocks that run independently of the circadian clock to regulate 12 h oscillations in gene expression and metabolism [42][43][44][45][46] .Here we found that 5% of all PASs cycle with a 12 h period.Further analysis of these genes showed enrichment of gene ontology terms and pathways such as "regulation of trans-synaptic signaling" and "protein-protein interactions at synapses" (Supplementary Table S6), indicating that APAs could function to regulate cyclic actions of cell signaling and communication.
Gene expression studies following changes in sleep homeostasis have largely ignored alternative polyadenylation.Of the 31,795 total PASs characterized in rat forebrain in our study, we determined that 2.5% were differentially expressed with sleep deprivation and recovery sleep.We also observed 6 GO terms signi cantly enriched following 6 hours of sleep loss and 26 following 4 hours of recovery sleep (Table 3).
Human APA isoforms have been linked to many neurological disorders 30 .Among the genes that we identi ed to have rhythmic expression of APA sites or had APA sites that were affected by sleep pressure, we found that 46 have also been correlated with brain disorder susceptibility (Table 4).For example, the human MAPT/TAU gene produces transcripts containing short or long 3' UTRs, and a 3' SNP is associated with both 3' UTR length and risks for 8 neurological disorders, including Alzheimer's and Parkinson's diseases 30 .Homozygosity of the more common SNP variant is associated with short MAPT 3' UTRs, homozygosity of the less common SNP variant is associated with long 3' UTRs, and heterozygosity is associated with 3' UTRs of intermediate lengths.In our rat APA data, there were both short and long 3' UTR forms (5 in total) of the Mapt gene that were identi ed (Fig. 3c, d).Only two are currently annotated in the rat genome and one of the newly discovered APAs was observed to cycle with time-of-day.In mouse, binding of the ALS-associated protein TDP-43 to two sites in the 3' UTR of Mapt has been shown to destabilize the mRNA 64 .In Alzheimer's disease, the expression level of TDP-43 protein is often low, and TAU is overexpressed and eventually forms neuro brillary tangles.The two TDP-43 binding sites that were experimentally determined in mouse are conserved in sequence and position in the rat gene, implying that transcripts with shorter 3' UTRs would not be affected by TDP-43, while longer ones could be destabilized 64,65 .The presence of at least one putative TDP-43 binding site in the human MAPT 3'UTR suggests that this may be contributing to the neurological disorder risk.
Ntrk2 is among the APA TWAS genes linked to anxiety 30 and has been associated with autism in other studies 66 .We found strong circadian oscillations of the 2 most abundant APA sites of the short, tyrosine kinase de cient (TK-) Ntrk2 isoform.The TK-isoform of Ntrk2 has several known functions, including a dominant negative effect on the full-length TK+ isoform during neuronal proliferation, differentiation, and survival.In addition, the TK-version promotes lopodia and neurite outgrowth; sequesters, translocates, and presents BNDF; and affects calcium signaling and cytoskeletal modi cations in glia 67 .Our WTTSseq data revealed short, medium, and long 3' UTRs in the rat Ntrk2 TK-isoform (Fig. 3e).In mice, the longer Ntrk2 TK-transcripts are preferentially targeted to apical dendrites 68 .Since the sequence of the rat 3' UTR is highly conserved with the mouse sequence, it is plausible that an analogous dendritic localization mechanism is also in use in the rat (Fig. 2c).Interestingly, 'Ntrk signaling' was one of the pathways over-represented in the circadian APA genes (Supplementary Table S3).APA sites in Src, Frs2, Atf1, Nras, Sh3gl2, Ntrk3, Mapk1, Grb2, Pik3r1, and Mapk14 contributed to this enrichment.
Four different APAs from the Sorl1 gene exhibited signi cant changes in our analyses; two circadian, one cycled with a 12 h period, and one was reduced during recovery from sleep deprivation (Fig. 4).In total, there were seven APAs in the Sorl1 3'UTR, three short, one medium and two long.The longest and most abundant isoform cycles per 12 h, the second longest and medium ones are circadian and the shortest isoform is differentially expressed after SD (Fig. 4).The mouse and human 3' UTRs share extensive similarities including 5 APAs in mouse and 3 in human based on the PolyA_DB v3 (https://exon.apps.wistar.org/polya_db/v2/ ) and UCSC database 37 .Four microRNA binding sites with high probability of preferential conservation are in good alignment (TargetScanHuman v8.0) 69 .The rst motif can be bound by ve miRNAs (miR-25-3p, miR-32-5p, miR-92-3p, miR-363-3p, and miR-367-3p), while the second contains overlapping 7mer and 8mer motifs bound by miR-128-3p and miR-27-3p, respectively.The nal two more distal sites are recognized by miR-153-3p and mir-137 (Fig. 4a).Sequences matching the consensus binding site for CPEB are present in the 3' UTRs of all three species, with 2 in very good alignment.Cytoplasmic polyadenylation element binding protein (CPEB) facilitates mRNA tra cking to synapses and local translation 70,71 , and we have previously shown that the core clock-controlled Fabp7 mRNA 72,73 contains functional CPE sites in its 3'UTR to regulate translation 74 .Since APOE4, an apolipoprotein E variant with increased risk of AD 75 , disrupts FABP7 interaction with sortilin, (an APOE receptor similar to Sorl1), to interfere with neuroprotective lipid signaling 76 , this suggests circadian variation in local translation of CPEB-mediated polyadenylation of target mRNAs may be a generalizable mechanism that modulates AD susceptibility through downstream lipid pathways.Any one or more of these conserved features could lead to conserved functional consequences dependent on APA choice.SORL1 encodes an endosomal recycling receptor 77 .Many polymorphisms or a de ciency of the gene are strong risk factors for AD 78,79 .
Although the current data are correlational in nature, they leverage a call to action for additional work to elucidate the core mechanisms of PAS usage in the brain and to examine the capacity of APA to affect the transcriptomes and proteomes that regulate central brain processes known to be altered by time-ofday and sleep/wake homeostasis.Moreover, it known that PAS usage varies across brain region and cell type 20 (i.e., substructure-, circuit-, laminar-or nucleus-speci c) 80 .These hypothesis-generating data provide an impetus for continued research aimed at delineating how sleep and circadian rhythms impact mental health and neurodegenerative disease.

Declarations
Tables Tables 1 to 4 are available in the Supplementary Files section Figures Figure 1 Schema of the brain region sampled, the collection time/condition, and a plot of the number of APA sites per gene.(a) The region of the central rat forebrain that was collected and used for RNA extraction is labeled 'forebrain'.(b) For sleep homeostasis experiments, rats were sleep-deprived for 6 h and allowed to recover for 0 to 8 h before tissue extraction.Three of the time-matched controls (no SD) were shared with the circadian experiment and one additional time point (no SD at ZT8), was not in common.For the circadian analysis, samples were taken at 4 h intervals from ZT2 until ZT22.Five biological replicates were used for all data points.(c) A diagram of a generic gene shows different types of APAs: within an internal exon (1); within an early intron (2); following an internal exon (3); within the longest documented 3' UTR (4); at the terminus of the longest documented 3' UTR (5); and distal to longest documented 3' UTR.(d) WTTS-seq PAS results; the number of genes on the x-axis (log10 scale) are plotted against the number of APA sites per gene.APA usage correlates with several brain disorders.RNA-seq coverage from individuals homozygous for the less common SNP allele that is associated with longer transcripts (adapted from Cui. et al. 27 ) is shown above the human MAPT 3' UTR map.Binding sites for TDP-43 (indicated by red arrows) that were experimentally determined in mouse align with putative sites in the rat gene, and one possible TDP-43 binding site is indicated in the human 3'UTR.The signi cantly circadian APA is marked with an asterisk.Blocks of homologous sequence between the rat and human genes that were found by BLAST

Figure 3 The
Figure 3 search are indicated by purple bars.The 3' UTR lengths are 4,380, 4,119 and 3,946 n.t. for human, mouse, and rat, respectively.(d) The average normalized read counts ±SE (y-axis) of the short Mapt isoforms lacking TDP binding sites (1+2) and the sum of the three longer isoforms (3+4+5) plotted against time-of-day (x-axis) are shown.(e) The 3' UTR of tyrosine kinase-de cient (TK-) isoforms of the human, mouse, and rat Ntrk2 TK-genes are shown.Arrows indicate the positions of APA sites.The depicted rat APAs are from this current dataset.Circadian rat APAs are indicated with asterisks.The 3' UTR lengths are 5,125, 5,008 and 8,004 n.t. for human, mouse, and rat, respectively.Mouse and rat sequence comparison by BLAST produced 4 segments having 91%, 83%, 86% and 82% identity for regions 1, 2, 3 and 4, depicted by blue bars.