Through a powerful gene-based analysis, we identified five gene regions (KIZ, KIZ-AS1, XRN2, LOC101929229, and SOX7) significantly associated with ASD. SOX7 and LOC101929229 (also known as PINX1-DT) were replicated in a separate GWAS dataset (KIZ was close to the replication threshold, p = 0.06) and were supported by differential gene expression analysis performed on publicly available RNA-seq data.
KIZ is located on chromosome 20 and encodes the Kizuna centrosomal protein, which aids in stabilizing the pericentriolar region of centrosomes before spindle formation. KIZ has been identified as significantly associated with autism in previous GWAS (Grove et al., 2019), TWAS (Huang et al., 2021), gene-based analysis (Alonso-Gonzalez et al., 2019), and methylation-based studies (Hannon et al., 2018), and cell cycle regulation has also been implicated in autism susceptibility in previous research (Packer, 2016; Pramparo et al., 2015). KIZ has also been found to be a potentially shared genetic locus between ASD and attention-deficit hyperactivity disorder (ADHD), providing support for its involvement in neurological conditions (Baranova et al., 2022).
XRN2 is located next to KIZ and encodes a 5’-3’ exonuclease that is involved in myriad RNA management processes, including transcriptional termination, miRNA expression regulation, nonsense-mediated mRNA decay, and rRNA maturation (Brannan et al., 2012; Nagarajan et al., 2013; Wang & Pestov, 2011; West et al., 2004). XRN2 has been found to play a role in regulating miRNA expression in neurons specifically, and altered miRNA expression regulation has been investigated as a potential mechanism for autism susceptibility (Abu-Elneel et al., 2008; Ghahramani Seno et al., 2011; Hicks & Middleton, 2016; Kinjo et al., 2013; Wu et al., 2016). Likewise, disruption of proper RNA metabolism as a result of altered expression of RNA binding proteins has been implicated in neurological disease as a whole, and the XRN gene family is involved in nonsense-mediated decay of mRNA, a process that has been implicated in autism pathophysiology (Marques et al., 2022; Nussbacher et al., 2019). Previous GWAS have reported SNPs in the region containing XRN2 to be significantly associated with ASD, a finding affirmed by gene-based analysis using MAGMA (Grove et al., 2019). Additionally, a transcriptome-wide association study (TWAS) found XRN2 to be significantly upregulated in autism, in accordance with our findings (Pain et al., 2019). Another gene-based analysis found XRN2 to be associated with ASD; follow-up gene-network and enrichment analyses showed that XRN2 interacts with several genes in the cAMP signaling pathway and RNA transport network, and that the enriched KEGG/GO terms for XRN2 found to be associated with ASD (spliceosome, RNA transport, and nucleic acid binding) describe processes essential to early development (Alonso-Gonzalez et al., 2019).
The extensive involvement of XRN2 in such complex mechanisms of gene expression regulation, particularly in neuronal cell types, offers possible insights into the vast heterogeneity of ASD and its overlap with other neurodevelopmental conditions. In fact, more recent research efforts have focused on ascertaining genetic commonalities between ASD and related disorders such as ADHD, obsessive-compulsive disorder (OCD), and Tourette syndrome, for which XRN2 appears to be a shared significant locus (Peyre et al., 2020; Yang et al., 2021).
SOX7 is of particular interest due to its hallmark involvement in the regulation of the Wnt/β-catenin pathway, an important developmental signaling pathway. SOX7 and its related SOX family genes encode transcription factors that are critical to the downregulation of the canonical Wnt/β-catenin signaling pathway, which controls embryonic development and adult homeostasis and is involved in a multitude of cellular processes (Katoh, 2002; MacDonald et al., 2009). While the Wnt pathway is ubiquitous to nearly all tissue types, proteins involved in Wnt signaling in the brain specifically have been found to localize in the synapses and influence synaptic growth, and knockout murine models of ASD risk genes that are a part of the Wnt pathway have provided support for the disruption of this pathway in autism-like behaviors (Kwan et al., 2016). Indeed, the Wnt/β-catenin signaling pathway has been suggested as a possible avenue for autism pathogenesis in several studies (Caracci et al., 2021; de la Torre-Ubieta et al., 2016; El Khouri et al., 2021; Hormozdiari et al., 2015; Kwan et al., 2016; Quesnel-Vallières et al., 2019; Vallée et al., 2019).
SOX7 also regulates angiogenesis, vasculogenesis, and endothelial cell development, and the SOX family of transcription factors is critical to cardiovascular development (Francois et al., 2010; Kim et al., 2016). For example, SOX7 was found to be upregulated in sustained hypoxic environments, mediating angiogenesis (Klomp et al., 2020), and a knockout model of SOX7 was found to result in profound vascular defects, demonstrating that SOX7 has an essential role in vasculogenesis and angiogenesis in early development (Lilly et al., 2017). Links between the role of SOX7 in developmental delay and congenital heart disease have been investigated. Specifically, deletions in the region where SOX7 resides have been demonstrated to simultaneously cause congenital heart defects and intellectual disability (Páez et al., 2008; Wat et al., 2009).
Additionally, Wnt signaling has been demonstrated to orchestrate the differentiation of neural vasculature, such as the blood-brain barrier (Reis & Liebner, 2013; Stenman et al., 2008). Likewise, there is evidence of vascular involvement in the development of autism (Casanova, 2007; Emanuele et al., 2010; Ouellette et al., 2020; Yao et al., 2006). One review in particular suggests that mutations affecting the delicate interactions between the Wnt and Shh signaling pathways may alter blood-brain barrier integrity in autism by aberrantly interacting with neurovascular molecules (Gozal et al., 2021).
Last, oxidative stress has been researched as a potential source of autism susceptibility (Bjørklund et al., 2020; Chauhan & Chauhan, 2006), and the interaction between altered vasculature and autism during oxidative stress could point to another potential source of pathogenesis (Yao et al., 2006). Indeed, the role of Wnt/β-catenin signaling in oxidative stress has been directly implicated in autism susceptibility (Zhang et al., 2012). This combination of evidence that implicates both Wnt signaling and SOX7 interactions in the multitude of interrelated processes that have been suggested as mechanisms behind the etiology of ASD, supplemented by our findings, provides ever-mounting support for more in-depth investigations of these particular genes and pathways.
Wnt/β-catenin signaling, oxidative stress, and impaired or altered vasculature have all been implicated in the development of ASD. These three factors interact with each other and with multiple systemic processes, which may contribute to ASD symptom heterogeneity. The fact that SOX7 is involved in the regulation of both Wnt/β-catenin signaling and vasculogenesis points to a potential converging mechanism behind the pathophysiology of ASD. Additionally, the association of SOX7 with autism has been investigated directly. A case study of a child with “8p23.1 duplication syndrome” revealed a de novo 1.81 Mbp duplication on chromosome 8 (8p23.1), spanning the region where SOX7 lies (Weber et al., 2014). This patient exhibited characteristic symptoms of the duplication syndrome, including delayed motor and speech development and intellectual disability, which heavily overlap with autism and related intellectual conditions. Indeed, this patient also exhibited symptoms specific to ASD, such as repetitive compulsive behavior.
A GWAS performed in a Mexican population found that SOX7 was differentially methylated between autism cases and controls (Aspra et al., 2022). Another study also found that differential methylation was associated with an “elevated polygenic burden” for autism and further identified that two significantly associated CpG sites were located near GWAS markers for autism on chromosome 8 in the same region as SOX7 (Hannon et al., 2018). It is worth noting that this study also found evidence of SNPs associated with both autism and DNA methylation that were annotated to KIZ and XRN2, two genes that we also found to be significantly associated with ASD.
Changes in methylation lead to changes in gene expression, providing another plausible mechanism of SOX7 involvement: a change in SOX7 methylation affects the expression and thus availability of the transcription factor it encodes, which has a downstream effect on the subsequent pathways SOX7 regulates, such as Wnt/β-catenin. Indeed, both methylation studies demonstrated a negative difference in methylation between autism cases and controls. Generally, undermethylation results in a less compact 3-dimensional genome structure, allowing for greater access to the gene and an increase in expression, which we see in the higher gene expression counts in autism cases versus controls in our RNA-seq data (Figure 2) (Buitrago et al., 2021; Keshet et al., 1986; Lewis & Bird, 1991).
Finally, altered expression of SOX7 has been shown to play a role in the development of several types of gliomas. One study demonstrated that SOX7 was downregulated in human glioma, allowing cancer development through upregulated Wnt/β-catenin signaling (Zhao et al., 2016), whereas another study demonstrated that overexpression of SOX7 in high-grade glioma (HGG) promoted cancer development by driving tumor growth via vessel abnormalization (Kim et al., 2018). These somewhat conflicting observations demonstrate that, due to its heavy involvement in regulating several intricately linked developmental and homeostatic functions, SOX7 expression must be delicately balanced. Interestingly, it has also been demonstrated that there is extensive overlap of genetic risk between autism and cancer (Crawley et al., 2016; Crespi, 2011; Gabrielli et al., 2019; Tabarés-Seisdedos & Rubenstein, 2009). SOX7 expression and its interactions may provide additional support for this conjecture, particularly due to its role in vasculature development and Wnt signaling regulation.
Limitations
The methods used in this study are not without limitations. Gene expression is a very dynamic process that is not only tissue dependent but also cell type specific and varies depending on the developmental stage and even external factors (Fitzgerald et al., 2004; Hsieh et al., 2000; Lawlor et al., 2017; Shen-Orr et al., 2010; Weyer & Schilling, 2003; Xu et al., 2014). These factors mean that any autism-related genes that are differentially expressed only at particular developmental stages or in other specific contexts may be missed. Additionally, differential expression analysis was performed on bulk RNA, whereas it is possible that altered gene expression between autism cases and controls is cell-type specific; characterizing expression in the individual cell types that make up key areas of the brain has a better chance of revealing mechanisms behind autism pathogenesis and of elucidating the pathophysiology behind the wide variety of ASD subtypes. Gene-based analysis also has some limitations, the most important being the reliance on a reference population for estimating linkage disequilibrium between variants. The similarity of this reference population to the population of study is crucial to the accuracy of many gene-based analyses, including those performed here. As a result, our findings are limited to European populations, as this was our reference population of choice. Future work includes a tighter integration of DNA and RNA information as well as extensions to non-European populations that have been under-researched.
These limitations notwithstanding, the study has considerable strengths. The AT method used in the gene-based GWAS not only integrates the favorable properties of sum and squared sum tests but also accounts for LD information among genetic variants. The heatmap of the correlation between genetic variants in SOX7 (Supplementary Figure 1) indicates that rs7005905 and rs7836366, rs10100209 and rs7836366, and rs10100209 and rs7005905 have strong positive linkage disequilibrium (LD) (ρ>0.5); rs4841432 has negative LD with all other variants except rs7009920. The strong LD in SOX7 and the powerful AT method support our identification of the autism-associated gene SOX7. The successful replication of SOX7 in the replication data and gene expression data, together with its biological plausibility, underscores the robustness of the association between SOX7 and autism. This finding may significantly advance our understanding of the etiology of autism, open new opportunities to reinvigorate stalled autism drug development, and increase the accuracy of autism risk prediction, which would make early intervention and prevention possible.
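To illustrate why combining sum and squared sum tests while accounting for LD can be advantageous, the following Python sketch computes the two component tests from per-variant z-scores and an LD correlation matrix. It is not the AT method itself, whose combining rule is not described in this section; it only shows the two ingredients under the standard assumption that the z-scores are multivariate normal with covariance equal to the LD matrix under the null. The function names, the three-variant LD matrix, and the z-scores are hypothetical values chosen for illustration and are not taken from the study data.

```python
import math
import numpy as np


def sum_test_pvalue(z, ld):
    """Sum test: add the z-scores and scale by the LD-implied variance.

    Under the null, (sum z)^2 / (1' R 1) follows a chi-square
    distribution with 1 degree of freedom.
    """
    z = np.asarray(z, dtype=float)
    ld = np.asarray(ld, dtype=float)
    ones = np.ones_like(z)
    stat = z.sum() ** 2 / (ones @ ld @ ones)
    # Survival function of chi-square(1): P(X > t) = erfc(sqrt(t / 2))
    return math.erfc(math.sqrt(stat / 2.0))


def squared_sum_test_pvalue(z, ld, n_sim=100_000, seed=0):
    """Squared sum test: Q = z'z, with a Monte Carlo null drawn from N(0, R)."""
    z = np.asarray(z, dtype=float)
    ld = np.asarray(ld, dtype=float)
    stat = float(z @ z)
    rng = np.random.default_rng(seed)
    null_z = rng.multivariate_normal(np.zeros(len(z)), ld, size=n_sim)
    null_stats = np.einsum("ij,ij->i", null_z, null_z)
    # Add-one correction keeps the empirical p-value strictly positive
    return (1 + np.sum(null_stats >= stat)) / (n_sim + 1)


# Hypothetical LD matrix for three variants in positive LD (rho = 0.6)
ld = np.array([[1.0, 0.6, 0.6],
               [0.6, 1.0, 0.6],
               [0.6, 0.6, 1.0]])

z_same = [2.5, 2.5, 2.5]       # effects in the same direction
z_opposite = [3.0, -3.0, 0.0]  # effects that cancel when summed

for label, z in [("same direction", z_same), ("opposite direction", z_opposite)]:
    print(label,
          "sum test p =", sum_test_pvalue(z, ld),
          "squared sum test p =", squared_sum_test_pvalue(z, ld))
```

In this toy setting the sum test is sensitive when effects share a direction but loses all signal when they cancel, whereas the squared sum test remains sensitive in the latter case; this complementarity is the motivation for a test that integrates both.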