Clinical parameters of the study subjects.
Case Controlled Cohort. A total of 100 never-medicated children on whom Paxgene stabilized blood was available were used to compose 24 pairs of ADHD affected or unaffected case controls. They were selected so that each ADHD case was matched by age and gender to an unaffected control. From these 24 pairs, 23 ADHD and 21 controls were successfully RNA sequenced to pre-specified criteria of read depth. The characteristics of the groups are shown in Table 1.
Discordant Twins Cohort. An initial cohort of 50 never-medicated, potentially discordant identical twin pairs was identified and then narrowed for clinical criteria focusing on a high degree of discordance in the severity of ADHD symptoms. A set of 24 pairs of twins with strong to moderate discordance were identified and 16 pairs had sufficient RNA quality and yield, and successful RNAseq data for further analysis (Table 1).
Analysis of differentially expressed genes
Case Controlled Cohort.
The average yield of nucleic acids was 5.4 ug/tube with an average 260/280 ratio of 2.3 and a BioAnalyzer RIN RNA quality index of 8.97. The tSMS method produced a very broad profiling of ribosome-depleted RNA transcripts in stabilized whole blood. By averaging all subjects in each group, filtering out low expressing transcripts of <0.01 RPKM, and comparing average RPKM for each transcript between groups, it is observed that tSMS yielded linear quantification (slope = 0.98) of gene expression over ~22 log2 orders of magnitude for more than 95,511 transcripts (Figure 1).
Case Controlled Differentially expressed genes (DEGs). Employing the 9 filters approach, 391 transcripts were ranked in the top 100 by at least one method. Of these, two transcripts were identified by all 9 methods, and 5 transcripts were identified by 8 of the 9 methods. Note that although this reduces the likelihood that their differential expression is dependent on only one analytical approach, it does not fully eliminate potential type I error because these are not completely independent methods (Figure 2). The full 391 gene list by p-value in the 9 methods can be found in Supplementary Table 1. Jaccard similarity analysis reveals the best concordance between methods edgeR, DESeq and DESeq2. The most discordant results were obtained with BaySeq (Supplementary Data 2). The 2 transcripts passing all 9 methods with a corrected p<0.05 are difficult to interpret: IGLV2-8 (↑2.0X in ADHD) is an immunoglobin lambda chain V-II region, and its expression appeared related to at least 10 other DEGs that were either HLA or IgG-related. More difficult to interpret is RNU1-94P (↑2.0X), which is a small nuclear protein pseudogene. The five transcripts passing 8 filters included 4 with better annotation: ABCB5, CWC27, IFI35, and AHNAK:
ABCB5 (↑1.9X in ADHD) is a member of the multi-drug resistance (MDR) family of transporters. ABCB5 has been mechanistically linked to glucose, phospholipid, and amino acid transport , and copy number variants have been linked to childhood obesity . ADHD is also associated with increased risk of obesity, with mechanisms unknown .
CWC27 (↓1.4X) is a spliceosomal trans-peptidyl-isomerase that has been associated with retinal abnormalities and developmental disorders, including neurological defects, in children with mutations .
IFI35 (↑1.4X) is an interferon-induced protein that acts as a ‘damage-associated molecular pattern (DAMP)’ , and thus could indicate some type of inflammatory source in ADHD. Substantial circumstantial data suggests inflammation may play a role in ADHD .
AHNAK (↑1.3X) is a particularly interesting target because it is a giant 680 kD neuroblast differentiation-associated protein, that has been associated with a range of relevant neurological disorders including bipolar disorder , depressive-like behaviors in knockout mice , ß-adrenergic regulation of the cardiac CaV1.2 calcium channel , and a variety of immune functions. The AHNAK family member, AHNAK2 (↑1.9X), is also identified on this list, with DEG identification by 2 of the analytic methods.
The fifth transcript, RP11-35015.2 (↓1.6X) is a poorly annotated transcript that lies within intron 1 of the IGF1 receptor (IGF1R), and thus, difficult to more clearly understand.
The complete list of 391 selected (Supplemental Table 1) includes other interesting transcripts, such as BACE2 (↓1.6X) and MED6 (↑1.5X). However, we proceeded to further narrow this case-control list by virtue of analyzing the cohort of discordant twins, and then determining whether any systematic patterns of similarity emerged.
Discordant Twins Cohort
tSMS of 16 discordant twin pairs produced transcript profiling of similar breadth and linearity as observed in the case-control study (Suppl. Figure 1). Using an essentially identical analytical approach to the case controls (Figure 2), the results of RNAseq from monozygotic ADHD-discordant twins were subjected to 9 analytical approaches and then the top 100 transcripts from each were ranked by their presence on the 9 lists. The resulting list of 385 transcripts can be found in Supplementary Table 2.
Discordant twins DEGs: A total of 10 transcripts passed 8 of 9 filters and contains transcripts with close similarity to several of the case-control DEGs (Figure 2). These high-ranking transcripts present potential hypotheses for further study with regard to ADHD, as follows:
ARL6IP5 (↓1.4X) is an ADP ribosylation factor-like GTPase 6-interacting protein, but is also known as JWA, a homologous gene of the glutamate-transporter-associated protein 3-18 (GTRAP3-18), and addicsin. ARL6IP5/JWA is expressed at high levels in the hippocampus and ARL6IP5/JWA knockout mice showed spatial cognitive potentiation and enhanced neurite growth in newborns . Conditional astrocytic ARL6IP5/JWA null mice demonstrates a role as a neuroprotective factor against dopaminergic neuronal degeneration . ARL6IP5/JWA has been associated with increased expression in the amygdala after chronic morphine treatment , and with morphine dependence via the delta opioid receptor .
CCDC107 (↓1.9X) is closely related to CCDC132 and CCDC84 found in the case control study. While relatively little is known about these coiled coil family members, coiled coil helix proteins (e.g. Chchd2) have been implicated in ADHD-like mouse models .
CCND1, cyclin D1 (↓2.9X), is related to CCNC and CCNL2 from the case control studies. While the cyclins are largely studied in relationship to cell cycle control, they can serve a variety of regulatory functions in cells.
DBF4B, DBF4 Zinc Finger B (↑2.5X), has been extensively studied as an activator of the Mcm2-7 helicase, a partner to Cdc7 kinase, and thus important for the initiation of DNA replication. Potentially of interest, it has also been associated with autism spectrum disorders via a semaphorin 5A (SEMA5A) eQTL network .
Dual specificity phosphatase 4 (DUSP4, ↓3.2X) is a family member to DUSP6 from the case-control study, and has recently been described as a control element in the suprachiasmatic clock network via modulation of vasoactive intestinal peptide signaling to ERK1/2 .
FAM159A (↑1.7X) has counterparts FAM104A, FAM134B, FAM157C, FAM162A, and FAM213B as differentially expressed in the case-control study. Little is known about FAM159A, but FAM134B, aka RETREG1 reticulophagy regulator 1, has a substantial literature connecting it with various functions including autophagy, and sensory neuropathy in humans  and Border Collies . Inhibition of FAM134A causes impaired proteostasis in the endoplasmic reticulum due to the accumulation of misfolded proteins, which has been implicated in vascular dementia . FAM162A is associated by GWAS to a gene-by-alcohol dependence interaction study of risky sexual behaviors and so it could be related to behavioral control . Coincidentally, ADHD is associated with increased sexual risk taking .
RARS2 (↓1.6X) is the arginyl-tRNA synthetase gene that has been associated with a spectrum of neurological disorders including myoclonic epilepsy, mental retardation, spasticity, and extrapyramidal features . Patients with RARS2 mutations exhibit early onset hypotonia, epileptic seizures, encephalopathy, and feeding difficulties in a syndrome termed pontocerebellar hypoplasia type 6 (PCH6).
RN7SL454P (↓1.95X), has counterparts RN7SL423P and RN7SL687P as DEG in the case control cohort. It appears to be a small non-coding transcript, intronic to the dynein axonemal heavy chain 17 gene on chromosome 17 (DNAH17), but with no known function.
SIK3.IT1 (↑2.1X) is salt-inducible kinase 3, with known relations to sleep and circadian rhythm, and to glucose and lipid homeostasis, steroidogenesis, and adipogenesis .
SPICE1(↑1.4X) is a spindle and centriole-associated protein, which might relate to DBF4B and CCND1 in regards to cell cycle control. Computational screening identifies it as an aurora kinase substrate and it is known to cooperate with CEP120 in centriole elongation. Interestingly, SIK3 also interacts with aurora A, aurora B, and polo-like kinases, and SIK3 repression enhances the antimitotic effect of aurora inhibition . Likewise, CCND1 has known interactions with aurora kinases . The coiled coil proteins, potentially including CCDC107, are commonly associated with the centrosome maturation and aurora kinases , suggesting several possible coregulatory scenarios for SPICE1, CCND1, DBF4B, and SIK3 in ADHD, potentially in a non-mitotic, but centriolar/aurora kinase-mediated control of gene expression.
DEGs common to both cohorts. From the 385 DEG list compiled from 9 analyses of the discordant twins (Suppl. Table 2), 6 transcripts are identical to the 391 DEG case-control results obtained by similar methods (Suppl. Table 1). While 6 identical matches between 2 different cohorts of ADHD subjects could occur by random chance (Fisher’s exact text p=0.318), it does suggest that these transcripts may merit further analysis as hypotheses in future studies.
HLA.DQB1.AS1, as the name implies, is an antisense transcript to the HLA-DQB1 locus on chromosome 6, which is elevated about 2-fold in ADHD cases. As noted previously, particularly in the case-control study, a substantial group of transcripts were HLA or IgG related, implying that some type of immune defect is at work. Because whole blood is being profiled, one must be cautious about an over-representation of immune-related transcripts (which are very plentiful in whole blood), but conversely, one cannot dismiss immune involvement as noted earlier. The potential role of inflammatory factors in ADHD has been raised over the years and is supported by various circumstantial data as recently reviewed , and recent analysis suggests the potential role of HLA loci in neurodevelopmental disorders such as ASD, and to a lesser degree ADHD .
In the same vein, IGHV3-74 is the variable region heavy chain transcript involved in antigen recognition by encoding IgM antibodies. While speculative, increased levels in the 2 cohorts could suggest some type of immune or autoimmune activity in ADHD.
The regulator of G protein signaling RGS2 is increased in both cohorts. RGS2 has diverse actions including promoting the translation of stress-associated proteins ATF4 and CHOP via an eIF-2B inhibitory domain . Of potential importance, RGS2 variants have been associated with childhood adversities as predictors of anxious and depressive responses , as well as the regulation of nicotine-induced anxiolytic activity in mice, and cocaine-induced rewarding effects [55, 56]. Likewise, RGS2 is thought to mediate the anxiolytic effects of oxytocin , and affects T cell activation, anxiety, and male aggressive behavior . RGS2 knockout mice exhibit increased fear learning, spatial learning, and neophobia . Further, RGS2 modulates the activity and internalization of the dopamine D2 receptor in neuroblastoma cells , and has been implicated in dopamine receptor signaling during amphetamine self-administration .
Of potential interest is Park7 RNA (DJ-1), which is extensively investigated as related to early onset Parkinson’s Disease . Based on some symptomatic similarities between Parkinson’s and ADHD, especially impulsivity , it was suggested there may be shared underlying causative factors. However, the circulating plasma protein levels of Park7 were not associated with ADHD in 125 ADHD patients versus 66 healthy controls , although whole blood RNA levels were not assessed.
Changes in RNU1-14P, which is a small nuclear pseudogene, is quite difficult to interpret, as is the RP11-661A12 transcript, though the latter is potentially an upstream ORF or alternate 5’ exon for the zinc finger CCCH-type containing 3 (ZC3H3) transcript, which is involved in nuclear adenylation and export of mRNAs .
Pathway Analysis. An additional set of transcripts had very similar isoforms reported in the case control results, for example, MED7 vs MED6, CLIC2 vs CLIC1, JRK vs JRKL. While there is no assurance that these close family members perform similar functions, it is worth considering whether a similar pattern is reflected. A list of 66 of these overlapping transcripts was submitted for an unbiased analysis using pre-curated relationships between the gene products (Ingenuity Pathway Analysis). Several plausible relationships are identified in a manner that could identify latent variables that might account for a substantial subset of the transcript variations. Statistically, the top pathway identified centered around the well-characterized Akt/Insulin/PI3kinase/NfKB axis, as shown in Figure 3. Underlying changes in glucose to insulin signaling could drive broader changes into the MED6/MED7/CCNC pathway as well as VAMP8/VAMP3/MMP/NDUF pathway. A second, and related, high scoring pathway is the Erk pathway, which would explain the CCND1/CDKN2B/CDKN2C/CDKN1C/RGS2 changes, and also the S100A12/S100A4/S100A8/CAPN1/DUSP4/DUSP6 transcript alterations (Figure 4).
RPKM analysis of the case-control and discordant twin datasets. In the context of a hypothesis-generating, exploratory study such as this, the prior analysis using 9 DEG methods may risk missing biologically important pathways in favor of statistical rigor. The datasets were re-analyzed using an RPKM threshold of 0.01, and combined fold-change (>1.5) and p-value (<0.01, uncorrected) filtering approach that has proven useful in prior biomarker studies [66, 67]. This triple filter identified 524 transcripts in the case-control study (Suppl. Table 3), and 505 transcripts in the twin cohort (Suppl. Table 4). By filtering for transcripts that were common to both datasets at the gene symbol level, 14 transcripts were identified, but 3 could be excluded because the direction of the changes were in the opposite direction. The remaining 11 transcripts, common by both their presence and direction in both twins and case controls are potentially interesting.
ACP2 (↑2.0X twins) is a lysosomal acid phosphatase that is known to play a vital role in the removal of mannose-6-phosphate residues. ACP2 has known or suspected roles in several neurodevelopmental disorders, as emphasized by mutations in Acp2 causing severe cerebellar and neurodegenerative diseases . Integrated analysis of GWAS and expression data identified ACP2 as a loci related to prepulse inhibition, a measure of sensorimotor gating that is known to be affected in several psychiatric disorders .
ALKBH6 (↑2.8X twins) is potentially important because, while relatively little is known about it, by analogy to its homolog ALKBH5, it is likely to function as a methyl-N6-adenosine (m6A) demethylase . While there are extensive investigations into DNA modifications, such as CpG methylation, as a mode of genetic regulation, a quickly escalating literature suggests that defects in RNA modifications are a contributing factor in neurodevelopmental , and other disorders .
ASPSCR1 (↑2.6X twins), is a UBX domain containing tether for SLC2A4, which has a known fusion protein to TFE3 that is involved in certain cancers. However, better known as TUG, it has important roles as an interactor with the glucose transporter GLUT4, with regulatory activity over insulin-regulated aminopeptidase (IRAP) and vasopressin secretion . While complex, vasopressin has been associated with ADHD by virtue of its known relation to social behaviors, and has been investigated as a potential therapy .
CLYBL (↑2.6X twins) is citrate lyase beta-like transcript, which encodes a malate/ß-methylmalate synthase with known effects on Vitamin B12 levels . Vitamin B12 was thought to have a role in ADHD, but supplementation studies have not reported consistent beneficial effects . The role of malate/ß-methylmalate in human physiology is incompletely studied, but methyphenidate treatment in rats causes significant changes in the citrate, malate, and isocitrate synthetic enzyme levels in the brain .
GAK (↑2.1X twins), cyclin G associated kinase, is potentially interesting in relation to ADHD. GAK (auxilin-2) has known involvement in synaptic function and neurological diseases , and is associated by GWAS with overlapping properties of Parkinson’s Disease and autoimmune diseases . GAK was elevated in both cohorts (Figure 5A) and in 14/16 of the discordant twin pairs, often in fairly striking fashion (Figure 5B). GAK mRNA expression across a range of human tissues shows relatively high expression in the cerebellum, about twice the level observed in whole blood (Suppl. Figure 2, GTEX).
GALE (↑2.9X twins), UDP-galactose-4-epimerase, is one of 3-4 key enzymes in the synthesis and utilization of galactose, and changes in the other members of this family, especially GALT and GALK, were noticeably affected in the ADHD cases, with all 3 of these enzymes in the galactose processing pathway being elevated in the ADHD-affected twins (Figure 6).
GIT1 (↑2.6X twins) was elevated >2-fold in the ADHD subjects in both the discordant twin and case-control cohorts (Figure 7, upper panel). Several striking coincidences draw attention to GIT1 as potential target. First, of the 15 known GIT1 isoforms, the changes in both cohorts seems largely restricted to a single isoform (uc060djr.1), which was elevated in 12 of 16 discordant twin pairs (Figure 7, lower panel). GIT1 SNPs were previously associated with ADHD by genome-wide association studies (GWAS) studies that employ a relatively unbiased view of known genetic variation , although other cohorts did not support this association . Fine mapping identifies an intronic SNP in GIT1 which causes reduced expression of GIT1 RNA and protein . Strikingly, GIT1 is extensively spliced (Figure 8), and the intronic SNP localizes to within 20 bp of 3’ terminus of the uc060djr.1 isoform identified in the present RNAseq analysis (Figure 7). GIT1 knockout mice have ADHD-like traits including a shift in the neuronal excitation/inhibition balance associated with a decreased glial GABA intensity , and behavioral correction with methyphenidate and amphetamine . Mechanistically, GIT1 is thought to play an important role in neurite outgrowth , synapse formation , and the turnover of ß2-adrenergic and other G-protein coupled receptors . GIT1 is expressed at relatively high levels (10-fold above blood) in most brain regions, tibial nerve, and the testes (Suppl. Figure 3, GTEX). While we cannot rule out a type I error, the present data suggests GIT1 merits further consideration as a factor in ADHD.
STAM2 (↑1.5X twins), signal transducing adaptor molecule 2, is a member of the endosome-associated ESCRT-0 complex that is highly expressed in neurons, especially in the cerebral and cerebellar cortex, hippocampus, and medial habenula . STAM2 regulates signaling via Jak2 and Jak3, which are directly involved in c-myc induction of IL-2 .
The remaining targets identified in both cohorts are more difficult to interpret. ERCC6L2 (↑2.9X twins) is excision-repair like 2, which has known relevance in cancer, but is difficult to connect with ADHD. HDLBP (↑1.3X twins) is highly relevant to high density lipoprotein metabolism, but has only a tenuous connection to ASD by virtue of a 2q27 deletion that causes reduced expression of HDLBP and 2 other genes . IDS (↑1.4X twins), iduronate 2-sulfatase, is highly studied in Hunter syndrome mucopolysaccharidosis , but has no known relation to ADHD. UBE2J2 (↑2.5X twins) directs the ubiquitination of hydroxylated amino acids in the ER, but has no reported connection to ADHD or other developmental disorders .
Correlation between ADHD Discordance and Gene Expression Discordance.
To further narrow candidate gene expression to potentially important correlates of ADHD, we moved from a categorical to a dimensional analysis of ADHD severity, building on evidence that ADHD functions like a trait in the population [90-92]. As explained earlier, ADHD severity scores (based on parent ADHD-RS raw scores) between the identical twins were compared to create a ‘discrepancy score’ for the twins. These scores were then ranked, with highest discrepancy (most different) being ranked 1, and then correlated to the difference in gene expression (fold change) between the paired twins, for the 505 RPKM list of transcripts (Suppl. Table 4). In an ideal scenario, the fold change would inversely correlate to the rank discrepancy (high fold change in gene expression, i.e. 10, associates with lowest numerical rank, ie 1, most discrepant). Negative correlations of r > -0.4 were observed for several transcripts of interest (boxed yellow, Suppl. Table 5), and closer inspection suggests they might have potential relevance to ADHD.
Among the highly correlated DEGs, RN7SKP194 (r=-0.60) bears some general similarity to the RN7SL454P target identified by the 9 filter approach, and discussed above. Both of these small nuclear pseudogenes are likely to have as yet unknown regulatory functions . SRP14, signal recognition particle 14, is potentially interesting because it is 5-7 fold lower in the ADHD twins, and it is part of a larger riboprotein complex thought to regulate translational arrest during protein synthesis in dendrites . GMFG, glia maturation factor gamma, is almost 16-fold lower in the ADHD twins and affects a diverse range of cell types. MICU1, elevated 3-fold in ADHD twins, encodes a Ca+2-sensing, regulatory subunit of the mitochondrial uniporter, and mutations in MICU1 cause a range of symptoms that include progressive extrapyramidal signs, learning disabilities, and fatigue . Among positively correlated transcripts, whereby the most discordant pairs showed the least fold-change in expression was GIT1 with r=0.585 and an average increase of 2.4 fold in the ADHD twins. Unfortunately, the changes in RNA levels are generally not perfectly correlated with changes in protein expression (r=~0.6) , and so this unexpected relationship may not be a significant impediment to GIT1’s relevance to ADHD.
Comparison to prior genetic studies
Prior exome sequencing of sporadic ADHD cases compared to sibling/parent triads identified ~8 interesting targets . Of those, exome mutations in TBC1D9 are a relatively close match to RNAseq expression changes in TBC1D17 observed in the present discordant twin pairs. This suggest a closer look at this family of proteins, likely to be important in vesicle transport, may be warranted. A second possible match is between exome mutations in WDR83, and expression changes in WDR45B in discordant twins, and WDR74 in case control subjects. Also, we observed some similarity to transcripts identified in ADHD by Liao et al , whereby transcripts MED8 and ARTN had suggestive p-values in our analysis.