Dysregulated Spliceosome Gene Expression May Be a Common Process in Brains of Neurological and Psychiatric Disorders


Alternative splicing (AS) contributes to cellular and functional tissue complexity, which is particularly pronounced in the brain. AS is tightly regulated because it is critical to many biological processes. Defective splicing is observed in several neurological and psychiatric disorders. While exonic mutations usually affect the splicing of an individual RNA, mutations in the splicing factors (components of the spliceosome) frequently produce widespread disruption in the processing of many precursor mRNAs. Thus, we tested the hypotheses that expression changes of spliceosome genes may be a common process and that shared splicing pathways may be involved in complex polygenic brain disorders. We searched for expression changes of spliceosome-related genes (SGs) using a transcriptome database of several brain regions in 6 neurological and psychiatric disorders, namely Alzheimer’s disease, autism spectrum disorder, bipolar disorder, major depressive disorder, Parkinson’s disease, and schizophrenia. Out of 255 SGs detected in brain, 138 showed excessive, significant changes in one or more of these disorders. Dysregulation of 10 SGs was shared in 4 disorders, and they were mostly downregulated. Six associated pathways were over-represented in all 6 disorders, including the major and the minor mRNA splicing pathways and RNA metabolism. Therefore, we found that aberrations in the mRNA splicing process involving the spliceosome complex may be a common trajectory to many complex brain disorders.

conducted in brain region for each disorder. A linear regression model in the limma package [41] was chosen to detect the case-control differences.

Catalog of spliceosome-related genes (SGs)
We used a total of 255 genes for analysis of SGs, as previously described [43]. This list combined 158 genes from the major and minor spliceosome family from the HUGO Gene Nomenclature Committee (HGNC) database (https://www.genenames.org/) and 109 core spliceosome component genes [44].
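As a quick consistency check on the catalog size, inclusion-exclusion gives the overlap implied between the two source lists. This assumes that the 255 genes are the deduplicated union of the HGNC list (A) and the core component list (B); the text only states that the lists were combined, so the overlap of 12 is an inference rather than a reported figure.

```latex
|A \cup B| = |A| + |B| - |A \cap B|
\quad\Longrightarrow\quad
|A \cap B| = 158 + 109 - 255 = 12
```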

Differentially expressed spliceosome genes (dSGs)
We queried the 255 spliceosome genes against the BrainEXP-NPD DEG database and identified the significantly differentially expressed spliceosome genes (dSGs) (BH-adjusted P < 0.05) in each disorder and each brain region for downstream functional annotation and network analyses. Because the sample size of the microarray data is much larger than that of the RNA-seq data, the downstream analyses used only results from the microarray data.
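The selection step can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy P values, and the choice to adjust across all tested genes before filtering to the spliceosome list are assumptions made for illustration. In practice the adjusted P values would come directly from the DEG database.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from largest to smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for offset, idx in enumerate(order):
        rank = m - offset  # rank of this P value in ascending order
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_dsgs(deg_pvalues, spliceosome_genes, alpha=0.05):
    """Adjust across all tested genes, then keep the significant spliceosome genes."""
    genes = list(deg_pvalues)
    adj = bh_adjust([deg_pvalues[g] for g in genes])
    return {g: q for g, q in zip(genes, adj) if g in spliceosome_genes and q < alpha}


# Toy P values (hypothetical, not taken from BrainEXP-NPD).
toy_pvalues = {"SG_1": 0.001, "SG_2": 0.012, "GENE_X": 0.030, "SG_3": 0.200, "GENE_Y": 0.900}
toy_sgs = {"SG_1", "SG_2", "SG_3"}
toy_dsgs = select_dsgs(toy_pvalues, toy_sgs)
```

In this toy run, `toy_dsgs` keeps the two hypothetical spliceosome genes whose adjusted P values fall below 0.05 and drops the third.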

Functional annotation and network analysis of the dSGs
STRING v11.5 (https://string-db.org/) [51] was used to perform Gene Ontology (GO) functional annotation and to detect protein-protein interaction (PPI) networks among the significant dSGs.

GWAS signals and expression quantitative trait loci (eQTL) related to dSGs
The significant dSGs were also searched against genome-wide association studies (GWAS) of about 25,025 genes from the 6 disorders' latest public GWAS summary statistics [52][53][54][55][56][57]. A hypergeometric distribution test was used to evaluate the significance. We further analyzed whether the dSGs have significant brain eQTL SNPs (single nucleotide polymorphisms) that could link the GWAS SNPs to expression regulation of the dSGs, based on the PsychENCODE eQTL results [58][59][60].

Co-expression networks related to dSGs
We used the co-expression results from PsychENCODE [58] to reveal the genes co-expressed with dSGs and their association with psychiatric disorders. A robust version of weighted gene correlation network analysis (rWGCNA) was conducted on 2160 brain samples, including 1232 control, 593 SCZ, 253 BD, and 82 ASD samples. Network analysis was performed 100 times, each time resampling 2/3 of the samples, to ensure the robustness of the modules. Consensus network analysis was used to define the final modules [58]. In total, 34 co-expression modules were identified. A disease association test was performed between the module eigengene (the first principal component of the module) and the disease trait. Stratified LD score regression (s-LDSR) was used to investigate the enrichment of GWAS signal in the co-expression modules. Finally, cell type enrichment was performed with cell type-specific marker genes using Fisher's exact test.
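The resampling scheme can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PsychENCODE implementation: the sample identifiers are hypothetical, sampling is taken to be without replacement and unstratified by diagnosis, and the network construction that would run on each subset is omitted.

```python
import random


def resample_two_thirds(sample_ids, n_iterations=100, seed=0):
    """Return `n_iterations` random subsets, each holding 2/3 of the samples."""
    rng = random.Random(seed)
    subset_size = (2 * len(sample_ids)) // 3
    # random.Random.sample draws without replacement.
    return [rng.sample(sample_ids, subset_size) for _ in range(n_iterations)]


# Hypothetical identifiers for the 2160 brain samples mentioned in the text.
sample_ids = [f"sample_{i}" for i in range(2160)]
subsets = resample_two_thirds(sample_ids)
```

Each of the 100 subsets holds 1440 distinct samples. A consensus step would then retain only the module assignments that recur across subsets.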
Transcriptome-wide association analysis (TWAS) for the dSGs

Differentially expressed spliceosome genes (dSGs) across neurological and psychiatric diseases
A total of 138 dSGs were identified as the union of significantly differentially expressed spliceosome genes from all the brain regions of all six diseases (Online Resource 2). Spliceosome genes were significantly over-represented (P = 8.52E-23) among the DEGs of the 6 diseases. In addition, we found 10 dSGs (FAM50A, HNRNPAB, LSM5, LSM7, PPWD1, SF3A1, SF3B5, SNRPB, SNRPD1, and YBX1) (Fig. 1, Table 2) shared by four disorders across all brain regions, based on the query results (Online Resource 3). The detailed summary statistics of the 10 overlapping genes are shown in Table 2. No dSG was shared by five or six disorders.
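The cross-disorder overlap can be computed by pooling dSGs over brain regions within each disorder and then counting disorders per gene. The Python sketch below is illustrative only: the gene labels and their assignments are hypothetical, and counting each gene once per disorder regardless of region is an assumption about how "shared by four disorders" was defined.

```python
from collections import Counter


def genes_shared_by(dsgs_by_disorder_region, min_disorders=4):
    """Genes that are dSGs in at least `min_disorders` disorders.

    A gene counts once per disorder, whichever brain region it was found in.
    """
    per_disorder = {}
    for (disorder, _region), genes in dsgs_by_disorder_region.items():
        per_disorder.setdefault(disorder, set()).update(genes)
    counts = Counter(g for genes in per_disorder.values() for g in genes)
    return {g: n for g, n in counts.items() if n >= min_disorders}


# Toy query results (hypothetical gene labels and assignments).
toy_results = {
    ("AD", "hippocampus"): {"GENE_A", "GENE_C"},
    ("ASD", "cerebellum"): {"GENE_A", "GENE_B"},
    ("ASD", "temporal cortex"): {"GENE_A"},
    ("BD", "frontal cortex"): {"GENE_A", "GENE_B"},
    ("SCZ", "frontal cortex"): {"GENE_A", "GENE_C"},
}
shared = genes_shared_by(toy_results)
```

Here only `GENE_A` reaches four disorders. Its two ASD regions count once.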
The number of significant dSGs varied across brain regions for each disorder. Some regions did not show any significant dSGs. The top 2 brain regions most affected by aberrant splicing were: hippocampus (n=17) and neocortex (n=9) in AD; cerebellum (n=52) and temporal cortex (n=13) in ASD; frontal cortex (n=47) and cerebellum (n=1) in BD; frontal cortex (n=20) and anterior cingulate cortex (n=3) in MDD; substantia nigra (n=25) and striatum (n=24) in PD; temporal cortex (n=32) and frontal cortex (n=22) in SCZ (Online Resource 3). The frontal cortex is the most affected region across diseases (AD, BD, SCZ, and MDD). This uneven distribution of dSGs across brain regions may indicate which regions are most disrupted by aberrant splicing, and which are spared, in each disease; this could be studied further region by region.
Out of the 138 dSGs, 116 were also dSGs in the RNA-seq replication datasets, and only 5 changed in opposite directions between the microarray and RNA-seq results (Online Resource 4), supporting the robustness of our results.
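The replication figures translate into simple rates, shown in the Python sketch below. The counts (138, 116, and 5) are taken from the text; the helper function and its use of the sign of an effect size (for example a log fold change) to define direction are assumptions for illustration.

```python
def replication_summary(microarray_effect, rnaseq_effect):
    """Count replicated genes and how many change in the same direction."""
    replicated = [g for g in microarray_effect if g in rnaseq_effect]
    # Same sign in both platforms means a concordant direction of change.
    concordant = [g for g in replicated if microarray_effect[g] * rnaseq_effect[g] > 0]
    return len(replicated), len(concordant)


# Figures reported in the text: 138 dSGs, 116 replicated, 5 in opposite directions.
n_dsgs, n_replicated, n_opposite = 138, 116, 5
replication_rate = n_replicated / n_dsgs                        # about 0.84
concordance_rate = (n_replicated - n_opposite) / n_replicated   # about 0.96
```

That is, roughly 84% of the dSGs replicated in RNA-seq, and about 96% of the replicated genes changed in the same direction.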

Functional annotation of dSGs
Functional annotation was performed for the dSGs of each disorder in each brain region listed in Online Resource 3. The results showed 18 significantly enriched (FDR < 0.05) GO terms, including spliceosomal complex, ribonucleoprotein complex, RNA binding, and regulation of RNA splicing, shared by the 6 disorders (Fig. 2, Table 3). In addition, all 8 genes (HNRNPAB, LSM5, LSM7, SF3A1, SF3B5, SNRPB, SNRPD1, and YBX1) (Online Resource 5) involved in the 18 GO terms shared by the 6 disorders were among the 10 dSGs shared by 4 disorders.

Reactome analyses of the dSGs
Reactome analyses revealed over-represented pathways that were unique to one disorder as well as pathways common to more than 2 disorders (Fig. 3, Table 4). The following 6 pathways were over-represented in all 6 disorders: mRNA splicing, the major and the minor pathways of mRNA splicing, processing of capped intron-containing pre-mRNA, metabolism of RNA, and SLBP (stem-loop binding protein) independent processing of histone pre-mRNAs. Additionally, the two genes (SNRPB and YBX1) (Online Resource 5) shared by the 6 disorders in the 6 pathways were also among the 10 dSGs.

In the PPI analysis (Table 5), two protein complexes were over-represented in all 6 disorders: "U2-type spliceosomal complex, and mRNA cis splicing, via spliceosome" and "U2-type precatalytic spliceosome".
The 3 genes (LSM7, SF3A1, and SF3B5) (Online Resource 5) that were shared by at least 4 disorders in the 2 PPI terms common to the 6 disorders were among the 10 significant dSGs.

Excessive GWAS signals around the significant dSGs
The 138 dSGs were compared to the list of GWAS-significant genes in the latest and largest public GWAS summary statistics of the 6 disorders (Online Resource 6) [52][53][54][55][56][57]. Ten dSGs also had SCZ GWAS associations (Online Resource 7). No dSG had GWAS associations in the other 5 disorders. According to the brain eQTL data from PsychENCODE [58], 3 of these 10 dSGs had 18 SNPs associated with their gene expression (Table 6), and these were the very same SNPs identified in the SCZ GWAS.
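The Methods specify a hypergeometric test for this kind of overlap. A minimal Python sketch of the upper-tail probability follows. The background of 25,025 genes, the 138 dSGs, and the overlap of 10 come from the text; the number of GWAS-significant genes (600) is a made-up placeholder because it is not reported here, so the resulting P value is illustrative only. Treating the 25,025 genes as the sampling background is likewise an assumption.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) when `draws` genes are sampled without replacement from
    `population` genes, of which `successes` carry the property of interest."""
    upper = min(successes, draws)
    total = comb(population, draws)
    # math.comb returns 0 when more items are requested than are available,
    # so impossible combinations drop out of the sum.
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# 600 GWAS-significant genes is a hypothetical placeholder value.
p_overlap = hypergeom_sf(k=10, population=25025, successes=600, draws=138)
```

Exact integer arithmetic keeps the tail sum free of rounding error until the final division, which matters for very small P values.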

Co-expression patterns of the 3 significant dSGs with both GWAS and eQTL signals
Two of the three significant dSGs were co-expressed with other genes in PsychENCODE co-expression modules [58]. IK was in the M11 module, which is related to RNA processing, spliceosome, and ribonucleoprotein complex functions. M11 was enriched for marker genes of astrocytes (FDR=0.0002). SF3B1 was in the M14 module, which was related to nuclear speck, regulation of stress-activated MAPK cascade, and Wnt-activated signaling pathway involved in forebrain neuron fate commitment. M19 was enriched for GWAS signals of SCZ (FDR=1.75e-05), BD (FDR=0.03), ASD (FDR=0.05) and Years of Education (FDR=9.70e-06).

Significant dSGs detected in the brains of neurological and psychiatric diseases
This study analyzed transcriptome data sets from several brain regions in 6 different neurological and psychiatric disorders, namely AD, ASD, BD, MDD, PD, and SCZ, and identified significant dSGs in all these conditions. No single dSG was found in all 6 conditions; however, SGs were enriched among the differentially expressed genes in all disorders. Moreover, 10 dSGs overlapped in 4 disorders, and 9 out of these 10 genes were downregulated in the brain regions we analyzed. Furthermore, 6 pathways were over-represented in all 6 disorders, including the major and minor mRNA splicing pathways and RNA metabolism. Therefore, we found that aberrations in the mRNA splicing process may be a common trajectory to many brain conditions, as it was dysregulated in all queried disorders.
The spliceosome, a macromolecular complex consisting of several proteins and small nuclear (sn) ribonucleoproteins (RNPs), recognizes specific sequences at the intron-exon borders to promote splicing. Several splicing activator and repressor proteins bound to enhancers and silencers regulate spliceosome activity, affecting AS of different pre-mRNAs that share common regulatory elements and resulting in AS patterns [62][63][64]. Pre-mRNA splicing is performed by 2 types of spliceosomes, the major (U2-dependent) and the minor (U12-dependent), which identify and remove the U2- and U12-type classes of introns, respectively [65]. We found the U2-type (major) spliceosomal complex to be the most shared system. Based on the PPI network, this complex was connected to dSGs in the brains of all 6 diseases analyzed in this study.
The majority of the dSGs are disease-specific, indicating the complexity of splicing regulation and of the relationships between the spliceosome and each disorder. Even though these SGs work in the "same" so-called spliceosome complexes, their individual expression changes lead to distinct downstream effects, including splicing changes in sets of genes and ultimately various symptoms and disorders. The mechanistic details remain to be uncovered.

Ten overlapping dSGs in neurological and psychiatric conditions
The 10 overlapping dSGs found in 4 of the studied disorders are associated with pre-mRNA processes, especially pre-mRNA splicing. Seven dSGs are components of the major U2-dependent spliceosome: 2 are splicing factors (SF3A1 and SF3B5), 2 are snRNA Sm-like proteins (LSM5 and LSM7), 2 are snRNPs (SNRPB and SNRPD1), and 1 is a DNA-binding protein (FAM50A). The paragraphs below briefly summarize each of the 10 dSGs.
FAM50A (Family with sequence similarity 50 member A; Chromosome (Chr) Xq28) is a nuclear protein that functions as a DNA-binding protein involved in mRNA processing; it has a role in the major spliceosome C-complex [66], and its allelic variants have been identified in males with the Armfield type of X-linked syndromic intellectual developmental disorder [66,67].
HNRNPAB (heterogeneous nuclear ribonucleoprotein A/B; Chr 5q35.3) is associated with pre-mRNAs, and binds to one of the components of the multiprotein editosome complex that performs RNA editing [68].
SF3B5 (Splicing factor 3B subunit 5; Chr 6q24.2), a component of the SF3B complex, is a major spliceosome subunit required for "A" complex assembly, which is formed by the binding of U2 snRNP to the branchpoint sequence in pre-mRNA [75].
SNRPB (Small nuclear ribonucleoprotein polypeptides B and B1; Chr 20p13) and SNRPD1 (Small nuclear ribonucleoprotein polypeptide D1; Chr 18q11.2) encode nuclear proteins found in the U1, U2, U4/U6, and U5 snRNPs, which contain the five snRNAs at the core of the major spliceosome. SNRPB allelic variants have been described in the cerebrocostomandibular syndrome [76-78].
YBX1 (Y-box binding protein 1; Chr 1p34.2) functions as a DNA- and RNA-binding protein and has been implicated in many cellular processes, including pre-mRNA splicing and RNA-dependent processes [79].
It is estimated that at least 20% of disease-causing mutations affect pre-mRNA splicing [80]. Spliceosomopathies are human diseases caused by mutations in the components of the major and minor spliceosomes, such as retinitis pigmentosa, myelodysplastic syndromes, spinal muscular atrophy, and craniofacial malformations [81][82][83]. Mutations in RNA-binding proteins involved in splicing regulation and disruptions in RNA metabolism, including mRNA splicing, have been associated with diseases such as ASD [29] and age-related disorders (frontotemporal lobar dementia [84], PD [85], and AD [21,86,87]). In AD, it has been suggested that the core splicing machinery may be altered due to the increased aggregation of insoluble U1 snRNP [88]. Raj et al. [21] found RNA-binding protein (RBP) sites enriched among splicing quantitative trait loci (sQTL). The binding targets for 18 RBPs were among the lead sQTL. Furthermore, sQTL SNPs were significantly enriched for several hnRNPs, and they were correlated with the intronic excision level of hundreds of genes, including several AD susceptibility loci. This indicates that altering the sequence-specific binding affinity of splicing factors can change the probability of a splicing event in vivo.

Cross-disease comparisons highlighted genes that contribute to all six brain diseases
Five genes (SNRPB, YBX1, LSM7, SF3B5, and SF3A1) were either shared by all six disorders in the 6 Reactome pathways (SNRPB and YBX1) or shared by at least four disorders in the 2 PPI terms common to the six disorders (LSM7, SF3A1, and SF3B5). All five were also among the 8 genes involved in the 18 GO terms shared by the six disorders and among the 10 significant dSGs. These genes may hold the key to connecting the hundreds of seemingly unrelated risk genes and their changed splicing patterns in patient brains. Their regulatory targets and biological processes should be the foci of future functional studies.
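The nesting of these gene lists can be verified directly from the names given in the Results. The Python sketch below uses only the gene symbols reported in the text; the variable names are illustrative.

```python
# Gene lists as given in the Results text.
shared_dsgs = {"FAM50A", "HNRNPAB", "LSM5", "LSM7", "PPWD1",
               "SF3A1", "SF3B5", "SNRPB", "SNRPD1", "YBX1"}  # 10 dSGs shared by 4 disorders
go_genes = {"HNRNPAB", "LSM5", "LSM7", "SF3A1",
            "SF3B5", "SNRPB", "SNRPD1", "YBX1"}              # genes in the 18 shared GO terms
reactome_genes = {"SNRPB", "YBX1"}                           # genes in the 6 shared pathways
ppi_genes = {"LSM7", "SF3A1", "SF3B5"}                       # genes in the 2 shared PPI terms

# The five highlighted genes are the union of the Reactome and PPI lists.
core_genes = reactome_genes | ppi_genes

# Chained subset test: core genes within GO genes within the 10 shared dSGs.
is_nested = core_genes <= go_genes <= shared_dsgs
```

The chained comparison holds, so each list sits inside the next. The two shared dSGs absent from the GO list are FAM50A and PPWD1.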

Genetic regulators of spliceosome genes contribute to brain disease risk
Out of the 255 SGs tested, 10 genes were both significant dSGs and GWAS genes of one of the brain diseases. Three genes have both significant GWAS and eQTL signals. There are 18 overlapping SNPs (Table 11) between the GWAS signals from the 10 dSGs and the eQTL signals from the 3 dSGs. The 3 genes were significantly differentially expressed in ASD, BD, and MDD compared with healthy controls.
Among the 3 genes, SF3B1 was a significantly down-regulated dSG in ASD and MDD (FDR = 0.031 and 0.043, respectively) and showed nominally significant down-regulation in the cerebellum, parietal cortex, and striatum in SCZ (P = 0.033, 0.016, and 0.045, respectively). It has GWAS signals related to SCZ as well as brain eQTL signals. SF3B1 encodes subunit 1 of the splicing factor 3b protein complex and is mainly related to the mRNA splicing pathway [89]. The SF3B1-related SNP rs788021 is a very strong risk SNP for cognitive ability and years of educational attainment (both at P = 1.00E-09) [90], and for SCZ (pleiotropy) (P = 5.92E-14) [57]. Our results suggest a potential mechanism in which a SNP disturbs expression of the spliceosome gene SF3B1, leading to downstream splicing changes in its target genes and increased risk of psychiatric disorder(s).

Current limitation and future experiments
Our DEG analyses of the spliceosome were performed using available microarray and RNA-seq data. The brain sample size is still relatively small. It is possible that more dSGs will be detected, and found to be shared across disorders, when the sample size increases. Future studies should focus on functional experiments to validate the relationships between altered expression of spliceosome-related genes and changes in splicing patterns in brains.

Conclusion
In summary, AS regulation in the human brain is distinct and highly complex [91,92], and it may have central roles in brain development and physiological function. We detected excessive changes of SG expression with both disease-specific and disease-shared patterns in brains of six neurological and psychiatric disorders. Our data support the notion that dysregulated AS processing, especially involving the major spliceosome, may have a dominant role in these disorders.

Figure 1 Significant dSGs in the 6 neuropsychiatric disorders. a. Venn diagram and b. UpSet plot show the disease-specific and shared dSGs.

Venn diagram shows the number of PPI terms over-represented in the 6 disorders using a list of dSGs (FDR<0.05). a. Venn diagram shows the numbers of shared PPI terms. b. UpSet plot shows the numbers of PPI terms shared across disorders.
