APA site usage is an understudied aspect of gene regulation. Although APA sequencing can reveal changes in overall gene expression, it is designed to focus on changes in APA usage and cannot reveal differences in splicing or transcription start sites (TSSs). On the other hand, bulk RNA-seq analysis often ignores APA, TSS, and splice isoforms and simply assesses reads per gene. Currently, it would be very difficult to enumerate copies of all the mRNA isoforms for each gene. Yet appreciation is growing for the importance of APA sites in regulating mRNA stability16,49, mRNA/protein localization19,50,51, and human disease30,52.
Rhythmic APA site usage has been uncovered in the mouse liver21,22,53, and in temperature-entrained cultured cells, circadian APA usage occurs in many genes and can regulate expression of specific central clock genes23. Still, alternative poly(A) site usage has received little attention. We therefore initiated this investigation into the conjunction of APA with sleep and circadian expression. As far as we are aware, the current study is the first to examine APA sites related to circadian rhythms and sleep pressure in any mammalian brain. In our analysis, we found 5,122 PASs and 318 circadian PASs that mapped outside of known genes, and many APAs within genes mapped to regions in which 3' ends have yet to be annotated. Based on prior WTTS-seq data sets and other PAS mapping approaches, we expect that some portion of our PASs will be method-based artifacts26,54, but, overall, the newly discovered PASs should add valuable insights into the regulation of the rat transcriptome and the characterization of PAS usage in the mammalian brain. There are several diverse ways in which data from this study can translate into biological relevance, as described in the examples below.
Here, we observed that 6% of all PASs cycled with a 24 h period. One of the top pathways identified for the circadian APA gene set was 'circadian entrainment' (Table 2). Since transcription-translation feedback loops are central to circadian regulation, this may not be surprising, and APA site usage may well have a role in that regulation23,53. For example, we find that one Sin3b APA is circadian (Fig. 3a, b). Sin3b encodes short and long variants conserved in mammals. The short variant binds to CRY1 but cannot bind HDAC155. The long isoform is implicated in regulation of Per1/Per2 transcription56, along with many other genes57. In our data, long Sin3b APA reads constitute the predominant isoform at ZT6 and ZT22, while the short, circadian isoform is the most abundant one at ZT10, ZT14, and perhaps ZT2 (Fig. 3b). Sin3b transcript levels in mouse hippocampus have previously been reported to be affected by sleep deprivation58, although this effect was not observed using TRAP-seq59, suggesting that post-transcriptional processing can lead to changes in sleep-dependent differential expression. Together with our work, this example highlights the importance of using various -omic approaches to properly decipher the complexity of molecular processing tied to changes in behavioral state in the brain.
Additional significant pathways emerged from the circadian APAs, such as oxytocin, ephrin, and MAPK signaling, which have demonstrated links to the circadian clock60-62. In the GO analysis of the circadian genes with multiple PASs, we discovered enrichment of terms related to the synapse (12), protein localization (6), and vesicles (7) (Table 2 and Supplementary Table S3), suggesting that APAs are poised to affect neural communication.
A large proportion of circadian APAs had expression peaks around ZT20 (Supplementary Fig. S1). Considering that rats are nocturnal, this is similar to what has been seen for bulk transcripts in several human tissues, including brain63. Interestingly, among the identified circadian APA sites, 3 were in genes for RNA-binding proteins (Celf2, Elavl3, and Rbfox1) whose expression correlates with more distal APA usage47. Peak expression of these three genes occurs from ZT21 to ZT1, so it would be interesting to see whether transcripts of predicted targets tend to be longer at these times.
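One simple way to frame the question of whether target transcripts are longer at particular times is a read-weighted mean 3' UTR length per gene and time point. The sketch below is illustrative only and is not the analysis used in this study: the function name, the input layout, and all numeric values are hypothetical placeholders.

```python
def weighted_utr_length(sites):
    """Read-weighted mean 3' UTR length for one gene at one time point.

    sites: list of (utr_length_nt, read_count) pairs, one pair per PAS.
    """
    total_reads = sum(count for _, count in sites)
    if total_reads == 0:
        raise ValueError("no reads for this gene at this time point")
    return sum(length * count for length, count in sites) / total_reads


# HYPOTHETICAL toy values for a gene with one proximal (500 nt) and one
# distal (2000 nt) PAS; these numbers are NOT taken from the study.
toy_gene = {
    "ZT10": [(500, 80), (2000, 20)],
    "ZT22": [(500, 40), (2000, 60)],
}

lengths = {zt: weighted_utr_length(sites) for zt, sites in toy_gene.items()}
for zt, value in lengths.items():
    print(zt, value)
```

A higher value at a given time point indicates a shift toward distal PAS usage; in practice such a comparison would need replicate-aware statistics rather than a single summary value.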
In addition to the 24 h circadian rhythm, recent studies have also demonstrated the existence of cell-autonomous ultradian clocks that run independently of the circadian clock to regulate 12 h oscillations in gene expression and metabolism42-46. Here we found that 5% of all PASs cycle with a 12 h period. Further analysis of these genes showed enrichment of gene ontology terms and pathways such as "regulation of trans-synaptic signaling" and "protein-protein interactions at synapses" (Supplementary Table S6), indicating that APAs could function to regulate cyclic actions of cell signaling and communication.
Gene expression studies following changes in sleep homeostasis have largely ignored alternative polyadenylation. Of the 31,795 total PASs characterized in rat forebrain in our study, we determined that 2.5% were differentially expressed with sleep deprivation and recovery sleep. We also observed 6 GO terms significantly enriched following 6 h of sleep loss and 26 following 4 h of recovery sleep (Table 3).
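To give a sense of scale, the reported percentages can be converted into approximate site counts. The sketch below assumes that each percentage is taken over the same 31,795 total PASs; because the percentages are rounded in the text, the resulting counts are estimates and not the exact values from the analysis. The variable names are illustrative.

```python
TOTAL_PAS = 31_795  # total PASs characterized in rat forebrain (from the text)

# Fractions as reported in the text (rounded), so counts are approximate.
reported_fractions = {
    "24 h rhythmic": 0.06,
    "12 h rhythmic": 0.05,
    "sleep deprivation / recovery": 0.025,
}

approx_counts = {
    label: round(TOTAL_PAS * fraction)
    for label, fraction in reported_fractions.items()
}

for label, count in approx_counts.items():
    print(f"{label}: ~{count} PASs")
```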
Human APA isoforms have been linked to many neurological disorders30. Among the genes that we identified as having rhythmic expression of APA sites or APA sites affected by sleep pressure, we found that 46 have also been correlated with brain disorder susceptibility (Table 4). For example, the human MAPT/TAU gene produces transcripts containing short or long 3' UTRs, and a 3' SNP is associated with both 3' UTR length and risks for 8 neurological disorders, including Alzheimer's and Parkinson's diseases30. Homozygosity of the more common SNP variant is associated with short MAPT 3' UTRs, homozygosity of the less common SNP variant is associated with long 3' UTRs, and heterozygosity is associated with 3' UTRs of intermediate lengths. In our rat APA data, we identified both short and long 3' UTR forms of the Mapt gene (5 in total) (Fig. 3c, d). Only two are currently annotated in the rat genome, and one of the newly discovered APAs was observed to cycle with time-of-day. In mouse, binding of the ALS-associated protein TDP-43 to two sites in the 3' UTR of Mapt has been shown to destabilize the mRNA64. In Alzheimer's disease, the expression level of TDP-43 protein is often low, and TAU is overexpressed and eventually forms neurofibrillary tangles. The two TDP-43 binding sites that were experimentally determined in mouse are conserved in sequence and position in the rat gene, implying that transcripts with shorter 3' UTRs would not be affected by TDP-43, while longer ones could be destabilized64,65. The presence of at least one putative TDP-43 binding site in the human MAPT 3' UTR suggests that a similar mechanism may contribute to the neurological disorder risk.
Ntrk2 is among the APA TWAS genes linked to anxiety30 and has been associated with autism in other studies66. We found strong circadian oscillations of the 2 most abundant APA sites of the short, tyrosine kinase-deficient (TK-) Ntrk2 isoform. The TK- isoform of Ntrk2 has several known functions, including a dominant negative effect on the full-length TK+ isoform during neuronal proliferation, differentiation, and survival. In addition, the TK- version promotes filopodia and neurite outgrowth; sequesters, translocates, and presents BDNF; and affects calcium signaling and cytoskeletal modifications in glia67. Our WTTS-seq data revealed short, medium, and long 3' UTRs in the rat Ntrk2 TK- isoform (Fig. 3e). In mice, the longer Ntrk2 TK- transcripts are preferentially targeted to apical dendrites68. Since the sequence of the rat 3' UTR is highly conserved with the mouse sequence, it is plausible that an analogous dendritic localization mechanism is also in use in the rat (Fig. 2c). Interestingly, 'Ntrk signaling' was one of the pathways over-represented in the circadian APA genes (Supplementary Table S3). APA sites in Src, Frs2, Atf1, Nras, Sh3gl2, Ntrk3, Mapk1, Grb2, Pik3r1, and Mapk14 contributed to this enrichment.
Four different APAs from the Sorl1 gene exhibited significant changes in our analyses: two were circadian, one cycled with a 12 h period, and one was reduced during recovery from sleep deprivation (Fig. 4). In total, there were seven APAs in the Sorl1 3' UTR, three short, one medium and two long. The longest and most abundant isoform cycles with a 12 h period, the second longest and medium ones are circadian, and the shortest isoform is differentially expressed after SD (Fig. 4). The mouse and human 3' UTRs share extensive similarities, including 5 APAs in mouse and 3 in human based on the PolyA_DB v3 (https://exon.apps.wistar.org/polya_db/v2/) and UCSC database37. Four microRNA binding sites with high probability of preferential conservation are in good alignment (TargetScanHuman v8.0)69. The first motif can be bound by five miRNAs (miR-25-3p, miR-32-5p, miR-92-3p, miR-363-3p, and miR-367-3p), while the second contains overlapping 7mer and 8mer motifs bound by miR-128-3p and miR-27-3p, respectively. The final two, more distal sites are recognized by miR-153-3p and miR-137 (Fig. 4a). Sequences matching the consensus binding site for CPEB are present in the 3' UTRs of all three species, with 2 in very good alignment. Cytoplasmic polyadenylation element binding protein (CPEB) facilitates mRNA trafficking to synapses and local translation70,71, and we have previously shown that the core clock-controlled Fabp7 mRNA72,73 contains functional CPE sites in its 3' UTR to regulate translation74. Since APOE4, an apolipoprotein E variant with increased risk of AD75, disrupts FABP7 interaction with sortilin (an APOE receptor similar to Sorl1) to interfere with neuroprotective lipid signaling76, this suggests that circadian variation in CPEB-mediated polyadenylation and local translation of target mRNAs may be a generalizable mechanism that modulates AD susceptibility through downstream lipid pathways. Any one or more of these conserved features could lead to conserved functional consequences dependent on APA choice. SORL1 encodes an endosomal recycling receptor77. Many polymorphisms or a deficiency of the gene are strong risk factors for AD78,79.
Although the current data are correlational in nature, they support a call to action for additional work to elucidate the core mechanisms of PAS usage in the brain and to examine the capacity of APA to affect the transcriptomes and proteomes that regulate central brain processes known to be altered by time-of-day and sleep/wake homeostasis. Moreover, it is known that PAS usage varies across brain region and cell type20 (i.e., substructure-, circuit-, laminar- or nucleus-specific)80. These hypothesis-generating data provide an impetus for continued research aimed at delineating how sleep and circadian rhythms impact mental health and neurodegenerative disease.