Shared genetic architecture (MiXeR)
MiXeR revealed a substantial number of shared ‘causal’ variants for both ASD&INT and ASD&EDU. As shown in the Venn diagram (Fig. 1), the estimated number of shared ‘causal’ variants between ASD and INT was 11.1k (SD=0.7k), with 1.6k (SD=1.2k) unique ASD variants and 0.6k (SD=0.7k) unique INT variants. The Dice coefficient was 0.91 for variants shared between ASD and INT (Table S15). MiXeR estimated 12.0k (SD=1.3k) shared ‘causal’ variants between ASD and EDU, with 0.7k (SD=0.7k) unique ASD variants and 1.7k (SD=1.4k) unique EDU variants. The Dice coefficient was 0.90 for variants shared between ASD and EDU (Table S15). The proportion of shared ‘causal’ variants with concordant effects was 0.58 (SD=0.004) for ASD&INT and 0.58 (SD=0.005) for ASD&EDU.
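As a rough check on how the Dice coefficient relates to the Venn diagram counts, the sketch below recomputes it from the rounded point estimates quoted above. It assumes the usual definition (twice the shared count divided by the sum of the two trait totals); the function name is illustrative and not part of MiXeR.

```python
def dice_coefficient(shared, unique_a, unique_b):
    """Dice coefficient: 2 * shared / (total_a + total_b)."""
    total_a = shared + unique_a
    total_b = shared + unique_b
    return 2.0 * shared / (total_a + total_b)

# Rounded point estimates (thousands of 'causal' variants) from the text.
asd_int = dice_coefficient(shared=11.1, unique_a=1.6, unique_b=0.6)
asd_edu = dice_coefficient(shared=12.0, unique_a=0.7, unique_b=1.7)

print(f"ASD&INT Dice from rounded counts: {asd_int:.3f}")
print(f"ASD&EDU Dice from rounded counts: {asd_edu:.3f}")
```

Because the inputs are rounded, the recomputed values can differ slightly in the second decimal from those reported in Table S15.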
In the conditional Q–Q plots, we observed SNP enrichment for ASD as a function of the significance of SNP associations with EDU (Fig. 2a) and INT (Fig. 2b). The reverse conditional Q–Q plots also demonstrated consistent enrichment in ASD given associations with INT and EDU, indicating polygenic overlap between the phenotypes (Fig S1a and S1b).
Log-likelihood plots are shown in Fig S1a and Fig S1b. The AIC values (Table S15) were positive when comparing modelled estimates to minimum overlap, but negative when comparing them to maximum overlap, for both the ASD/INT and ASD/EDU analyses. This indicates that the MiXeR-predicted overlap is not distinguishable from the maximum possible overlap, suggesting caution in interpreting the estimates from MiXeR. ASD and INT had an LDSR-based genome-wide genetic correlation of rg=0.22 (SD=0.032, p=4.60e-12) and a MiXeR-estimated genetic correlation of shared variants of ρβ=0.24 (SD=0.01). For ASD and EDU, those values were rg=0.21 (SD=0.028, p=2.17e-13) and ρβ=0.25 (SD=0.02), respectively. This pattern of extensive genetic overlap but weak genetic correlation is indicative of mixed effect directions, supported by the MiXeR-estimated proportion of shared ‘causal’ variants with concordant effects of 0.58 for both ASD&INT and ASD&EDU.
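The link between a weak correlation of shared effects and a concordance close to one half can be illustrated with a short calculation. The sketch assumes that shared effect sizes follow a zero-mean bivariate normal distribution with correlation ρβ, in which case the probability that two effects have the same sign is 1/2 + arcsin(ρβ)/π. This is a textbook identity used for illustration only, not a description of how MiXeR computes its concordance estimate.

```python
import math

def concordant_fraction(rho):
    """P(same sign) for a zero-mean bivariate normal with correlation rho."""
    return 0.5 + math.asin(rho) / math.pi

# Correlations of shared variants quoted in the text.
for label, rho in [("ASD&INT", 0.24), ("ASD&EDU", 0.25)]:
    print(f"{label}: rho_beta={rho:.2f} -> concordant fraction ~ "
          f"{concordant_fraction(rho):.3f}")
```

Under this assumption, both correlations imply a concordant fraction close to the reported 0.58.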
Identification of shared genetic loci (cond/conjFDR)
CondFDR: We leveraged this pleiotropic enrichment using condFDR analysis and re-ranked the ASD SNPs conditional on their association with EDU or INT, and vice versa. At condFDR <0.01, there were 9 loci associated with ASD conditional on INT, of which two were not found in the original ASD GWAS (Table S1). We identified 12 loci associated with ASD conditional on EDU, of which four were not identified in the original ASD GWAS (Table S2).
ConjFDR: The conjFDR Manhattan plots are shown in Fig 3a and 3b. At conjFDR <0.05, we detected 19 genetic loci jointly associated with ASD and INT (Table S3), of which 11 were unique to ASD and INT. We detected 32 distinct genetic loci jointly associated with ASD and EDU (Table S4), of which 24 were unique to ASD and EDU. Eight loci were common to both the ASD and EDU and the ASD and INT analyses, yielding a total of 43 distinct loci at conjFDR <0.05. Of these SNPs, 18 were intronic, 13 intergenic, 11 non-coding RNA intronic and 1 exonic (see Tables S3 and S4).
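The two statistics used above can be illustrated with a small sketch. It assumes the common empirical definition, in which the condFDR of a SNP is its p-value in the primary trait divided by the empirical conditional distribution of primary-trait p-values among SNPs at least as significant in the conditioning trait, and the conjFDR is the larger of the two condFDR values. The function names and toy p-values are hypothetical; steps used in real analyses, such as LD-based pruning and lookup tables, are omitted.

```python
import numpy as np

def cond_fdr(p_primary, p_cond):
    """Naive empirical condFDR of the primary trait given the conditioning trait."""
    p_primary = np.asarray(p_primary, dtype=float)
    p_cond = np.asarray(p_cond, dtype=float)
    out = np.empty_like(p_primary)
    for i in range(p_primary.size):
        # SNPs at least as significant in the conditioning trait (includes SNP i).
        stratum = p_cond <= p_cond[i]
        # Empirical conditional CDF of the primary p-value within that stratum.
        ecdf = np.mean(p_primary[stratum] <= p_primary[i])
        out[i] = min(1.0, p_primary[i] / ecdf)
    return out

def conj_fdr(p_a, p_b):
    """conjFDR: maximum of the two condFDR values for each SNP."""
    return np.maximum(cond_fdr(p_a, p_b), cond_fdr(p_b, p_a))

# Toy p-values for four hypothetical SNPs (not data from this study).
p_asd = [0.001, 0.04, 0.3, 0.8]
p_edu = [0.5, 0.002, 0.9, 0.01]

cond = cond_fdr(p_asd, p_edu)
conj = conj_fdr(p_asd, p_edu)
print("condFDR(ASD|EDU):", cond)
print("conjFDR(ASD&EDU):", conj)
```

Taking the maximum makes the conjFDR a conservative estimate: a SNP passes the threshold only if it passes in both conditional directions.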
Evaluation of allelic effect directions: As denoted by the sign of the effect, 68% (13/19) of the shared loci between ASD and INT had concordant allelic effect directions (Table S3), and 59% (19/32) of the shared loci between ASD and EDU had concordant allelic effect directions (Table S4).
Novel ASD loci: As seen in Table S3, 11 of the 19 lead SNPs jointly associated with ASD and INT at conjFDR <0.05 were not identified in the original ASD GWAS 11, and as seen in Table S4, 21 of the 32 loci jointly associated with ASD and EDU were also novel. Five of these loci overlapped between EDU and INT, yielding a total of 27 novel ASD loci, which are presented in Table 2.
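The locus totals and percentages reported above follow from simple inclusion-exclusion arithmetic, which the following sketch reproduces from the counts given in the text. The helper name is illustrative.

```python
def union_count(n_a, n_b, n_both):
    """Number of distinct items in two sets that share n_both items."""
    return n_a + n_b - n_both

# Loci at conjFDR < 0.05: 19 (ASD&INT), 32 (ASD&EDU), 8 common to both.
total_loci = union_count(19, 32, 8)

# Novel loci: 11 (ASD&INT), 21 (ASD&EDU), 5 overlapping.
novel_loci = union_count(11, 21, 5)

# Functional categories of the SNPs at the distinct loci.
annotation_total = 18 + 13 + 11 + 1

# Share of loci with concordant allelic effect directions.
concordant_int_pct = 100 * 13 / 19
concordant_edu_pct = 100 * 19 / 32

print(total_loci, novel_loci, annotation_total)
print(f"{concordant_int_pct:.1f}% (ASD&INT), {concordant_edu_pct:.1f}% (ASD&EDU)")
```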
Functional annotation: We functionally annotated all SNPs with a conjFDR value <0.1 within loci shared between ASD & INT and ASD & EDU, which included 2356 candidate SNPs jointly associated with ASD and INT and 1782 candidate SNPs jointly associated with ASD and EDU.
Gene-mapping: Using three different methods (positional, eQTL, and chromatin interaction), we mapped 104 genes from candidate SNPs within loci shared between ASD and INT (see Table S7) and 132 genes for ASD and EDU (see Table S8). Of these, 16 genes for ASD and INT and 10 genes for ASD and EDU were supported by all three methods.
Gene-set enrichment and molecular function analysis (FUMA)
Gene expression in different tissues: Heatmaps of all genes based on candidate SNPs are shown in Fig S4a (ASD and EDU) and Fig S5a (ASD and INT). As seen in Fig S4b and Fig S5b, candidate genes from ASD and EDU had significantly upregulated differentially expressed genes (DEGs) in four of 54 tissues, namely brain cortex, frontal cortex, cerebellum and cerebellar hemisphere (Fig S4b), while candidate genes from ASD and INT had significantly upregulated DEGs in two tissues: cerebellum and cerebellar hemisphere (Fig S5b).
Gene expression during brain development periods: Candidate genes tended to have upregulated expression during the early prenatal period and late infancy (Fig S3c and Fig S4c), but these differences were not significant.
Gene set enrichments: GO biological processes and molecular functions (Tables S9 and S10): Enrichment was found in 43 different gene sets, including positive regulation of central nervous system development, midbrain development, neuronal differentiation, synaptic signaling, neuron death, gliogenesis, astrocyte development, mitochondrion organization, synapse plasticity, and more general pathways such as inositol phosphate and response to reactive oxygen species.
Transcription factors: Candidate genes were enriched in the pathways of 100 transcription factors, among them HIF1 (hypoxia inducible factor 1), NRF1 (nuclear respiratory factor 1) and the vitamin D receptor.
Immunologic signatures: Candidate genes were enriched in 23 immune-related gene sets for ASD and EDU, among them the interleukin-2 and interleukin-10 pathways, the Macrophage Stimulating 1 (MSP1) pathway, EBNA1 anticorrelated, and development of regulatory T cells (Tregs).
GWAS gene sets: As seen in Tables S9 and S10, enrichment was seen in 100 different gene sets, including ASD-related social behaviors (attendance at social groups, helping behavior, birth), gene sets for cognitive function, mental disorders (short sleep duration, alcohol abuse, mood instability, schizophrenia, depression, neuroticism), intracranial volume, neurologic diseases, inflammatory bowel diseases, cardiovascular measures, lung function/pulmonary fibrosis and endocrine measures.
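Gene-set enrichment of the kind summarised above is typically assessed with an over-representation test. The sketch below shows a one-sided hypergeometric test, assuming this is the form of test applied to the candidate genes. Apart from the 132 mapped genes for ASD and EDU, every number (background size, gene-set size, overlap) is hypothetical and chosen only for illustration, and multiple-testing correction is omitted.

```python
from math import comb

def hypergeom_upper_tail(n_background, n_in_set, n_candidates, n_overlap):
    """P(overlap >= n_overlap) when n_candidates genes are drawn at random
    from n_background genes, n_in_set of which belong to the gene set."""
    total = comb(n_background, n_candidates)
    upper = min(n_in_set, n_candidates)
    tail = sum(
        comb(n_in_set, i) * comb(n_background - n_in_set, n_candidates - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total

# 132 mapped genes (from the text); the remaining values are hypothetical.
p_value = hypergeom_upper_tail(
    n_background=20000,  # hypothetical number of background genes
    n_in_set=200,        # hypothetical gene-set size
    n_candidates=132,    # genes mapped for ASD and EDU
    n_overlap=6,         # hypothetical number of candidate genes in the set
)
print(f"Over-representation p-value: {p_value:.3g}")
```

In practice, p-values from many gene sets would be adjusted for multiple testing before any set is declared enriched.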
FUMA results for concordant loci are shown in Fig S6 – S7 and Tables S11 and S13. Tissue expression analyses (Fig S6b and S7b) showed that DEGs were significantly different in 13 tissues for ASD and INT, with the highest expression in the frontal cortex. Similar results were found for concordant genes for ASD and EDU, where DEGs were significantly less expressed in the amygdala, hippocampus, basal ganglia, and substantia nigra. The highest (non-significant) upregulation was found in the brain frontal cortex and cerebellum (Fig S7b). Heatmaps of concordant candidate genes for ASD and EDU, and ASD and INT, are shown in Fig S6c and Fig S7c. The concordant genes were enriched in gene sets for extremely high intelligence, social traits (attending social groups and helping behavior), psychiatric disorders, inflammatory bowel diseases and immunological signatures (Tables S11 and S13).
FUMA results for discordant loci are shown in Fig S8 – S9 and Tables S12 and S14. Analysis of tissue expression showed that discordant genes had significantly upregulated DEGs only in the cerebellum and cerebellar hemisphere (Fig S8b and Fig S9b). Heatmaps of discordant candidate genes for ASD and EDU, and ASD and INT, are shown in Fig S8a and Fig S9a. Gene set enrichment analysis showed enrichment in several gene sets, including neurodegenerative diseases (incl. Alzheimer’s disease and Parkinson’s disease), chronic pain, alcohol use disorder and craniofacial microsomia (small head and face).