Participants:
Data for this study came from two projects. One was the Consortium for Neuropsychiatric Phenomics (CNP), a study performed at the Semel Institute of the University of California Los Angeles (UCLA) to examine underlying genetic and neural factors and their links to three neuropsychiatric illnesses: schizophrenia, bipolar disorder, and attention-deficit hyperactivity disorder (ADHD). Genetic, cognitive and behavioral data were similarly collected in the Genetics of Impulsivity (GOI) project, performed at the University of Georgia and the University of Chicago. Ancestry was self-reported and genetically confirmed.
CNP Sample.36 Healthy control participants, ages 21–50, were recruited by community advertisements in the Los Angeles area and were included only if they identified as either “Caucasian, not of Hispanic or Latino descent,” or “Hispanic or Latino, of any race,” as per NIH racial and ethnic minority group guidelines (N = 1138; 731 White, 407 White Hispanic). Participants were excluded if they met any of the following criteria: neurological disease, history of head injury with loss of consciousness, use of psychoactive medications, substance dependence within the 6 months before screening, or a positive drug screen on the day of testing. In addition, smaller samples of people with diagnoses of schizophrenia, bipolar disorder, and ADHD (following the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision (DSM-IV-TR)37) were recruited using a patient-oriented strategy involving outreach to local clinics and online portals. In total, 996 healthy individuals, 53 participants with schizophrenia, 42 with bipolar disorder, and 47 with ADHD were evaluated. Diagnoses for all individuals followed the DSM-IV-TR and were based on the Structured Clinical Interview for DSM-IV (SCID-I)38 supplemented by the Adult ADHD Interview, a structured interview form derived from the Kiddie Schedule for Affective Disorders and Schizophrenia, Present and Lifetime Version (KSADS-PL).39 Participants who were included underwent a neuropsychological battery and submitted blood samples for genotyping. All subjects gave written informed consent in line with the procedure approved by the Institutional Review Board at UCLA. Data from the CNP study have been reported in prior publications.40–56
GOI Sample.15,57 A total of 934 Caucasian-ancestry participants 18–30 years of age were tested at two sites (40% at Athens, GA and 60% at Chicago, IL). Inclusion criteria were English fluency, age 18–30 years, and self-reported Caucasian race and non-Hispanic ethnicity to minimize population stratification.58 Exclusion criteria were scores > 12 on the Alcohol Use Disorders Identification Test (AUDIT)59 or the Drug Use Disorders Identification Test (DUDIT).60 All participants were screened for recent alcohol or drug use via breathalyzer or urine drug test before testing. Another exclusion criterion was treatment over the last 12 months, or self-reported current need for treatment, for any of the following: depression, bipolar disorder, general anxiety, social anxiety, post-traumatic stress disorder, obsessive compulsive disorder, panic attacks/disorder, phobia, schizophrenia spectrum disorders, anorexia, bulimia, or binge eating. ADHD was not exclusionary in this sample, although it was exclusionary in the CNP sample. DNA was collected via a saliva sample in an Oragene DNA kit (DNA Genotek Inc., Kanata, ON, Canada).
Balloon Analogue Risk Task. The BART is a computerized behavioral measure of risky decision-making.4 Virtual balloons are presented on a computer screen, one balloon per trial, and the participant can “pump” the balloons up by pressing a response key, virtually inflating the balloons. Each pump produces a set increase in an amount of money (e.g., 5 cents per pump) or points earned on that trial. However, after a certain number of pumps, determined probabilistically, the balloon explodes, and the trial yields no money or points. The participant must decide when to “cash out” of a given trial, by pressing a response key, to retain earnings in a cumulative bank. The objective is for the participant to earn as much money, or as many points, as possible across the trials in the task. Versions of the BART vary with respect to the number of trials/balloons used, as well as the probability of explosions (e.g., some tasks have used balloons with a single probability of explosion,4 while others have used different-colored balloons with different probabilities of explosion3). The primary dependent variable of the task is the mean or total number of pumps on trials in which the balloon did not explode; these have been termed ‘adjusted pumps’. The measure ‘adjusted pumps’ is preferred to the absolute number of pumps because explosions artificially restrict the range of pumping.61
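As an illustration of the 'adjusted pumps' measure described above, the following minimal sketch averages pumps over only those trials on which the balloon did not explode. The function name and the (pumps, exploded) record layout are assumptions made for this example and do not reflect the task software or data files of either study.

```python
from statistics import mean


def adjusted_pumps(trials):
    """Mean number of pumps on trials that ended in a cash-out.

    `trials` is a list of (pumps, exploded) pairs. This layout is a
    hypothetical one chosen for illustration.
    """
    kept = [pumps for pumps, exploded in trials if not exploded]
    if not kept:
        return None  # every balloon exploded, so the measure is undefined
    return mean(kept)


# Five example trials: two explosions are dropped before averaging.
example_trials = [(10, False), (32, True), (14, False), (7, True), (12, False)]
print(adjusted_pumps(example_trials))
```

Exploded trials are dropped rather than counted because the explosion, not the participant, ended the pumping on those trials.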
The CNP version of the BART, programmed in E-Prime 2.0, consisted of 40 trials, with balloons that were colored red or blue (20 of each color). Red balloons were “high risk”, with the probability of explosion on each red balloon randomly selected from a range of 1 to 32 pumps; blue balloons were “low risk”, with the probability of explosion randomly selected from a range of 1 to 128 pumps. The order of balloon color across trials was random. Participants received 5 points for each adjusted pump. The GOI version of the BART consisted of 30 balloons, each associated with a probability of explosion selected from a range of 1 to 64 pumps. Participants in neither study received payment for their performance.
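One way to picture the CNP trial structure is the sketch below, which builds a 40-trial schedule with 20 red and 20 blue balloons in random order. It assumes that each balloon's explosion point is drawn uniformly from its stated range; the text gives only the ranges, so the sampling scheme, the function name, and the seed parameter are all assumptions for illustration.

```python
import random


def cnp_like_schedule(seed=0):
    """Return 40 (color, explosion_point) pairs: 20 red and 20 blue.

    Assumption: explosion points are uniform on 1..32 (red) and
    1..128 (blue). The actual E-Prime implementation may differ.
    """
    rng = random.Random(seed)
    colors = ["red"] * 20 + ["blue"] * 20
    rng.shuffle(colors)  # random order of balloon color across trials
    max_pumps = {"red": 32, "blue": 128}
    return [(color, rng.randint(1, max_pumps[color])) for color in colors]


schedule = cnp_like_schedule(seed=1)
print(schedule[:3])
```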
Genetic Analyses. Genotyping was performed using the Omni Illumina 500,000 SNP chip. For all genotype data, markers were excluded for quality control if they had less than a 95% genotyping rate, had a minor allele frequency (MAF) less than 1%, deviated significantly from Hardy-Weinberg equilibrium (p < 10⁻⁶), or were identified as having non-random genotyping failure (p < 10⁻¹⁰). Individuals were excluded for missing genotypic data (< 2% genotypes), missing phenotypic data, or deviation from expected autosomal heterozygosity (Fhet < .2). To reduce spurious effects arising from poorly powered rare variants in these modestly sized samples, only SNPs with MAF greater than .20 were included in the analyses, thus emphasizing the inclusion of more reliable associations. Results were similar but slightly weaker when the traditional .01 cut-off was used. GWAS was performed on each of the CNP and GOI datasets as follows. Principal component analysis (PCA) was performed within each study as well as jointly with the 1000 Genomes (1KG) ancestry informative markers for use in QC and modeling efforts. Partial correlations (in R) were used in the polygenic scoring analysis to control for the population differences in the phenotype when comparing to the scored PCA-controlled GWAS. PLINK62 was used to perform two linear regressions (one per dataset) with Mean Adjusted Pumps as the dependent variable of interest, supplying sex, age and the first five PCA dimensions as covariates (Mean Adjusted Pumps ~ sex + age + 5 ancestry principal components).
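The marker-level filters above can be sketched for a single SNP as follows. This is a simplified illustration and not the PLINK procedure itself: the function name and the 0/1/2 genotype coding with None for missing calls are assumptions, the Hardy-Weinberg check uses a 1-degree-of-freedom chi-square approximation that may differ from the test used in the study, and the non-random genotyping failure filter is omitted.

```python
import math


def marker_passes_qc(genotypes, min_call_rate=0.95, min_maf=0.01, hwe_alpha=1e-6):
    """Apply call-rate, MAF and Hardy-Weinberg filters to one marker.

    `genotypes` holds allele counts (0, 1 or 2) per individual, with None
    for a missing call. This coding is a hypothetical one for illustration.
    """
    called = [g for g in genotypes if g is not None]
    if not called or len(called) / len(genotypes) < min_call_rate:
        return False  # genotyping rate below threshold

    n = len(called)
    freq = sum(called) / (2 * n)  # frequency of the counted allele
    if min(freq, 1 - freq) < min_maf:
        return False  # minor allele too rare

    # Chi-square goodness of fit to Hardy-Weinberg proportions (1 df).
    observed = [called.count(k) for k in (0, 1, 2)]
    expected = [n * (1 - freq) ** 2, 2 * n * freq * (1 - freq), n * freq ** 2]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return p_value >= hwe_alpha


in_equilibrium = [0] * 25 + [1] * 50 + [2] * 25
print(marker_passes_qc(in_equilibrium))
```

Passing `min_maf=0.20` instead of the default would mirror the stricter MAF cut-off used for the main analyses.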
Each set of summary statistics was clumped and, along with the paired genotypes from its complement study, used to create polygenic scores for each individual in the target sample.63 As performed by the PRSice method, we tested multiple thresholds (in this case 500 possible thresholds between 0.001 and 0.5) by running a linear regression of the score at each threshold (MeanAdjPumpsZ ~ SCORE@THRESHOLD + sex + age + 5 ancestry principal components) to determine the optimal threshold (the one yielding the smallest p-value). The p-value obtained at the optimal threshold was corrected for multiple testing (500 potential thresholds) using the false discovery rate (FDR). Scores were then compared using a partial correlation analysis that controlled for the same covariates in the target dataset as in the source's GWAS. Imputation to 1KG Phase 3 was also performed on each dataset, and the same methodology was applied.
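The scoring and correction steps above can be sketched in a few lines. This is not the PRSice implementation: the function names are hypothetical, the score is taken to be a plain sum of effect-size-weighted allele counts over SNPs passing the p-value threshold, the 500 thresholds are assumed to be evenly spaced, and the FDR correction is assumed to be the Benjamini-Hochberg procedure.

```python
def polygenic_score(dosages, betas, gwas_pvalues, threshold):
    """Sum beta * allele count over SNPs whose GWAS p-value is <= threshold."""
    return sum(d * b for d, b, p in zip(dosages, betas, gwas_pvalues) if p <= threshold)


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Assumption: 500 evenly spaced thresholds from 0.001 to 0.5.
thresholds = [0.001 + k * (0.5 - 0.001) / 499 for k in range(500)]

# One individual, three SNPs: only SNPs with p <= 0.05 contribute.
print(polygenic_score([0, 1, 2], [0.1, -0.2, 0.3], [0.001, 0.2, 0.04], 0.05))
```

In a full analysis, one regression p-value per threshold would be passed to `bh_adjust`, and the adjusted value at the best threshold would be reported.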
A MEGA-analysis GWAS was performed on the merged imputed genotypes of the CNP and GOI datasets. To account for the sample population differences, the MEGA-analysis included the population covariates of the respective sources while also covarying by the source factor itself. After standard QC measures (see methods above), PLINK was used to perform a linear regression per the following model (Mean Adjusted Pumps ~ gt + sex + age + study sample + 5 ancestry principal components). A quantile-quantile (Q-Q) plot of observed vs. expected p-values and a Manhattan plot of the linear regression results were generated in R. Estimation of genetic variance of all SNPs was performed using the GREML method64 as implemented in GCTA (v1.92.4).65 Risk scores were then derived, and the best MEGA-PRS was tested for overlap with the single-question self-report of risk-taking (“Would you describe yourself as someone who takes risks?”) in European UK Biobank participants (N = 436,236)19 and with disease status in European samples from the 2017 public ADHD, Bipolar Disorder, Alcohol Use Disorder, and “ever/never” prior cannabis use datasets, using the PRS methods above. Public datasets representing ADHD (PGC & iPSYCH, N = 19,099 cases, 34,194 controls),66 Bipolar Disorder (PGC, N = 20,352 cases, 31,358 controls, effective sample size 46,582),67 a non-psychiatric control phenotype (PGC Inflammatory Bowel Disease, which is a combination of Ulcerative Colitis and Crohn’s disease PGC datasets), Alcohol Use Disorder (UK Biobank AUDIT, N = 121,604),23 and prior cannabis use (UK Biobank and ICC, N = 53,179 cases, 131,586 controls, effective sample size 151,493)28 were downloaded, and summary statistics were extracted in order to construct PRS models. For each disorder, a PRS was constructed and tested for prediction of BART performance in our MEGA-analysis combined sample.
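As a sketch of how the points of a Q-Q plot can be derived from a list of GWAS p-values, the function below pairs each observed value with its expected value under the null hypothesis. The plots in this study were produced in R; Python is used here only for illustration, and the function name and the (i - 0.5) / n plotting positions are assumptions (one common convention among several).

```python
import math


def qq_points(pvalues):
    """Return (expected, observed) -log10(p) pairs for a Q-Q plot.

    Assumption: the i-th smallest of n p-values is compared with the
    null quantile (i - 0.5) / n.
    """
    n = len(pvalues)
    observed = sorted(pvalues)
    return [
        (-math.log10((i - 0.5) / n), -math.log10(p))
        for i, p in enumerate(observed, start=1)
    ]


# p-values that match the null quantiles exactly fall on the diagonal.
for expected, observed in qq_points([0.75, 0.25]):
    print(round(expected, 3), round(observed, 3))
```

Points rising above the diagonal at the high end would indicate more small p-values than expected by chance.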
Similarly, UK Biobank analyses were conducted using the summary statistics as reported by Clifton and colleagues19 to evaluate a shared genetic propensity for risk-taking between self-report and BART performance in CNP, GOI and our MEGA samples according to the PRS methods above.