Participants, dataset 1
The first dataset was used for functional independent component analysis (fICA). EEG recordings of participants potentially eligible were collected from September 2013 until September 2018 at Ziekenhuis Netwerk Antwerpen (ZNA), a large community hospital in Antwerp, Belgium. The study was approved by the Institutional Review Board of ZNA. We abided by the principles of the Declaration of Helsinki. A total of 1,195 adult participants – 1,132 psychiatric patients and 63 healthy controls – were included and provided written informed consent. Exclusion criteria for all participants were age < 18 years, inability to give informed consent for whatever reason, restlessness that could interfere with the EEG, and inability to sit still. Only patients who were hospital-admitted for stabilization and/or treatment of a psychiatric disease (no further selection was made) were included to allow collection of a representative and heterogenous cohort of (severely ill) psychiatric patients. Healthy controls were defined as having no current psychiatric episode and never been treated by a mental health service. After preprocessing, the total sample for fICA consisted of 1,123 (1,061 patients and 62 healthy controls).
Participants of the rTMS study, dataset 2
The second dataset was used for validation purposes and the evaluation of treatment effects. It consisted of 196 patients, diagnosed with non-psychotic MDD or dysthymia and BDI-II ≥14 at baseline, receiving protocolized rTMS treatment concurrent with psychotherapy. All participants provided written informed consent. Only participants receiving high-frequency TMS (10 Hz left dorsolateral prefrontal cortex, DLPFC) or low-frequency TMS (1 Hz right DLPFC) were included; participants receiving both 1 Hz and 10 Hz sequentially were excluded since this would make interpretation of results difficult. All patients completed at least 10 sessions of treatment, and filled in the BDI-II at baseline and at the last session (on average session 21). Details about this sample are described elsewhere.49,77
Participants of the medication study, dataset 3
The third dataset used for validation purposes and the evaluation of treatment effects was an international multi-center, randomized, prospective open-label trial (phase-IV clinical trial), or iSPOT-D sample (International Study to Predict Optimized Treatment in Depression). This study consisted of 1,008 patients diagnosed with non-psychotic MDD who were subsequently randomized to escitalopram, sertraline, or venlafaxine. All participants provided written informed consent and this study was approved by the institutional review boards at all of the participating sites and this trial was registered with ClinicalTrials.gov under id NCT00693849. At baseline and after 8 weeks of treatment patients filled in the Quick Inventory of Depressive Symptomatology (QIDS). Only data from participants who completed 8 weeks of randomized medication treatment (‘per protocol’ sample) were included. Details about this sample have been published elsewhere.35,78
DNA isolation and genotyping (dataset 1)
DNA was isolated and genotyped for 887 participants who provided informed consent for DNA extraction and analyses. One 10 ml ethylenediaminetetraacetic acid (EDTA) tube was filled during standard blood draws at the ward. DNA was extracted in the clinical molecular genetics laboratory of the University Medical Center Utrecht (UMCU). Samples were brought to a DNA concentration of 50 ng/μl with a total concentration of 200 ng DNA per participant. Subsequently, samples were sent in two batches to the Human Genotyping Facility of Erasmus Medical Center (Erasmus MC) Rotterdam for Global Screening Array v.1 (GSA) by Illumina, Santa Clara (California), USA, that has excellent validity and reliability.79
Genetic quality control (dataset 1)
Quality control (QC) and principal component analysis (PCA) were done with PLINK 1.9,80 and performed on two batches separately (see Supplementary information). Pre-imputation involved the creation of a superset with the highest quality SNPs for subsequent sample QC. The superset of SNPs was created by excluding those with genotype call rates <0.01, minor allele frequencies (MAF) <0.1, Hardy-Weinberg equilibrium (HWE) <10-4, and linkage disequilibrium (LD) r2 >0.2, with a window size of 50 and window shifting of a step size of 5. Using the superset, subjects were removed who: 1) had a mismatch in their sex between reported and genotyped; 2) were too extremely hetero- or homozygous (their F-values differed ≥3 SDs from the mean in the whole cohort); 3) were related (their pi-hat was above 0.1: one of each pair was randomly excluded); and 4) were cohort outliers (had values for the first two ancestry principal components (PCs) that deviated ≥3 SDs from the mean of the whole cohort).
This was followed by a regular SNP-level QC for exclusion of ill performing SNPs: variants with genotyping rate <0.01; MAF < 0.01; HWE p-value <10-5 were thus excluded. European ancestry was checked by comparing with the HapMap population: individuals were removed who deviated ≥3 SDs from the first and second genetic ancestry PCs from the Northern and Western European (CEU) population.
Lastly, before imputation, genotypic data was compared with the Haplotype Reference Consortium panel. Strands, alleles, positions and frequency differences were checked. Chromosomes were pre-phased and imputed using the Michigan Imputation Server.81 Post-imputation QC was performed to include reliable SNPs: variants that had a MAF >0.05 and LD r2 ≥0.8 were included, resulting in 5,211,700 SNPs available to generate PRS.
Polygenic risk score calculation (dataset 1)
The polygenic risk score (PRS) for MDD per individual was generated for those SNPs and individuals remaining after QC. The summary statistics of MDD (meta-analysis of PGC and UK Biobank82) were used to generate PRSs.83 That dataset has no sample overlap with the current dataset as Belgian individuals had not been included in the large MDD GWAS. If only odds ratios (ORs) were reported in the summary statistics, ORs were log-converted to beta values as effect sizes. To that end, the beta values, effective allele, and p-values were extracted from all summary statistics.
SNPs that overlapped between the summary statistics GWASs (training datasets), 1,000 genomes (reference), and our dataset (target) were extracted. Then, insertions or deletions, and ambiguous SNPs, were excluded. To account for complicated LD structure of SNPs in the genome, these SNPs were clumped in two rounds using PLINK 1.90b3z,84 according to previously established methods;85,86 round 1 with the default parameters (physical distance threshold 250 kb and LD threshold (r2) 0.5); round 2 with a physical distance threshold of 5,000 kb and LD threshold (r2) 0.2. Additionally, we excluded all SNPs in genomic regions with strong or complex LD structures. Sample overlap between training datasets with our target samples is unlikely since all samples belong to different cohorts and to our knowledge no Belgians had been included in the aforementioned GWASs.
We constructed PRSs based on risk alleles weighted by their effect sizes estimate using PLINK’s score function for 12 GWAS p-value thresholds (PT)81,87: 5×10-8, 5×10-7, 5×10-6, 5×10-5, 5×10-4, 5×10-3, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5 and 1.
EEG recordings and preprocessing
During EEG recordings, subjects were seated in a sound and light attenuated room that was controlled at an ambient temperature of around 22°C. The participants were instructed to sit still for the duration of the recording without thought instructions. The operator did not intervene when drowsiness patterns were observed in the EEG.
Resting-state eyes closed EEG recordings for dataset 1 were acquired from 65 channels of the Electrical Geodesics Incorporated (EGI; Magstim, UK) system (dataset 1) and from 26 channels (10-20 electrode international system: Fp1, Fp2, F7, F3, Fz, F4, F8, FC3, FCz, FC4, T3, C3, Cz, C4, T4, CP3, CPz, CP4, T5, P3, Pz, P4, T6, O1, Oz, and O2) of the Neuroscan NuAmps (Compumedics, Australia; dataset 2 and 3). Data were recorded for three (dataset 1) or two (dataset 2 and 3) minutes during eyes closed condition. The sampling frequency was 500 Hz for most recordings, but 1,000 Hz for 6 recordings in dataset 1 (which were down-sampled to 500 Hz prior to further analyses). Data were referenced to Cz (dataset 1) or average mastoids with a ground at AFz (dataset 2 and 3). Horizontal eye movements were recorded with electrodes 61 and 64 (dataset 1) or electrodes placed 1.5 cm lateral to the outer canthus of each eye (dataset 2 and 3). Vertical eye movements were recorded with electrodes 5 and 62 (dataset 1) or electrodes placed 3 mm above the middle of the left eyebrow and 1.5 cm below the middle of the left bottom eyelid (dataset 2 and 3). Cartesian coordinates of the EGI system electrodes (dataset 1) were converted to spherical coordinates prior to EEG preprocessing.
Subsequently, the following steps were taken in the EEG preprocessing and artefact rejection procedure using Brain Vision Analyzer 2.0 (Brain Products, Germany): 1) data filtering: 0.5-90 Hz (dataset 1) or 0.3-100 Hz (dataset 2 and 3), and notch filter; 2) removal and spherical spline interpolation of noisy signals or flat lines; 3) electro-oculography (EOG) correction, using a regression-based technique88; 4) segmentation in 4-second epochs; and 4) artefact-rejection using an automatic procedure (criteria: maximal allowed difference of 150 µV peak-to-peak). This resulted in a minimum of one-minute data per subject.
LORETA-fICA model
The EEG was used for estimating the cortical source distribution of electric neuronal activity by means of LORETA (low-resolution electromagnetic tomography; free academic software available at https://www.uzh.ch/keyinst/loreta). This method weights minimum norm inverse solution, and localization inference is based on the standardized estimates of the current density.89
The following analysis steps were performed using the collection of 4-second artefact-free epochs obtained from dataset 1. In the first step, each EEG recording was transformed to the frequency domain, using the discrete Fourier transform. The cross-spectral matrices were obtained for six frequency bands, defined as: delta (1.5-3.5 Hz), theta (4-7.5 Hz), alpha (8-13 Hz), beta (14.5-30 Hz), low-gamma (31-47 Hz), and high-gamma (>70 Hz). Aiming to eliminate the notch bands used at different sites in the EU and US, the 48-69 Hz range was excluded. In the second step, from data of each cross-spectrum matrix, the spectral density was computed for each cortical voxel, sampled at 5 mm resolution in a realistic head model, using the MNI152 template.48 In the third step, the spectral-spatial data of all subjects was concatenated, and ICA was performed on this data, aiming to identifying independent spectral-spatial components (i.e. functional networks). This method was recently validated in Aoki et al. and Gerrits et al. and reliable identified DMN (default mode network) and TP (task-positive) networks.47,48
The typical ICA model assumes that the source signals are not observable, statistically independent and non-Gaussian, with an unknown, but linear, mixing process,90 and is described by the following formula:
x=As
where x, A and s represent matrices. In our case, these three matrices consisted of the following data:
- Matrix x with 1,123 rows corresponding to all subjects of dataset 1, and the data per subject consists of 37,434 (6,239x6) columns corresponding to the spectral power at 6,239 cortical voxels for the six frequency bands. This approach, using a priori determined frequency bands, is a unique feature of the method used.91
- Matrix s with 58 rows corresponding to the number of statistically independent components (i.e. functional networks), and 37,434 columns. In this way, each functional network contains 6 spatial images corresponding to the neural activity of each frequency band (i.e. in a cross-frequency manner).
- Matrix A with 1,123 rows and 58 columns. Thus, what remains of this data reduction for every subject is the amount of each component that was used for that subject. This amount is expressed as a loading (i.e. signed weight or score) per functional network for each subject.
Independent components
Each independent cross-frequency spectral-spatial functional network (fICA network or EEG component) represents sets of brain regions that are consistently activated or deactivated together within and across a given frequency band. The number of EEG components here was estimated from a measure related to Wackermann’s Omega Complexity,92 indicating 58.2 dimensions, hence the LORETA-fICA analysis was constrained to 58 components explaining 96.0% of total EEG power variance, which is a great reduction of data.
To visualize the functional networks (i.e. correlation of brain regions that are consistently activated or deactivated), a threshold at 3 z-values was set. Individual loadings per fICA network were obtained for each subject, corresponding to the strength of that network for a given individual subject.
The functional networks that were established based on the first dataset, were prospectively applied to dataset 2 and 3. Likewise, for each subject in each dataset, EEG component loadings were obtained per network. These loadings were used in the statistical analysis.
Outcome measures
For biomarker identification (discovery, Figure 1 main text) PRS-MDD was examined for association with independent EEG components (dataset 1). The primary outcome for the prediction analysis (validation, Figure 1 main text) was dimensional improvement of depressive symptoms, based on the BDI-II for rTMS sample (dataset 2) and the QIDS for iSPOT-D sample (dataset 3), which are both self-report questionnaires. In a secondary analysis we focused on categorical improvement defined as response or remission. Self-reports were taken at intake and after treatment completion (on average at session 21 for rTMS and at week 8 for antidepressants). Response was defined as ≥50% reduction of baseline score, while remission was defined as a score of ≤12 on the BDI-II93 (dataset 2) or ≤5 on the QIDS94 (dataset 3).
Statistics
SPSS version 26 was used for statistical analyses. Effects sizes (ES) of significant main effects are reported as R2 for continuous measures or as Cohen’s d (d) for binary measures. Two-sided tests were performed for statistical significance testing.
A priori, analyses were stratified by sex, since previous iSPOT-D studies reported sex-specific predictors of treatment outcome,35,41,42,78 and this would enable us to identify stratification biomarkers. Hence, sex was included as main factor or women and men were analyzed separately if the analysis could not accommodate sex as main factor, rather than handled as covariate since covariation can only resolve quantitative – not qualitative – sex differences. If no sex interaction was found or the effect for both sexes was the same direction, men and women were analyzed together.
First, a discovery analysis was performed to examine if there was an association between one or more fICA components and PRS-MDD (dataset 1). To that end, a partial correlation analysis – controlling for age and the first five genetic ancestry principal components (PCs) on men and women separately – was run between all EEG fICA components loadings and the most stringent PRS-MDD p-value thresholds (PT=5.0×10-8 to PT=0.05) in order to choose the optimal PT, which is unknown a priori.83 The significance level was set to α<0.001, to obtain a highly significant result and reduce the likelihood of false positives and non-reproducibility of findings.95
Second, validation analyses were performed (dataset 2 and 3) to examine if the EEG component that was found to be significantly associated with the PRS-MDD in the discovery analysis, was predictive of treatment outcome (dataset 2 and 3). The significance cut-off for the analyses following up was set at conventional α=0.05 as these analyses were intended for validation of the findings in the discovery analysis. We investigated possible associations between individual EEG component loading and absolute changes in BDI-II score (dataset 2) and QIDS score (dataset 3). The absolute change (∆) was defined as the symptom severity score difference between baseline and at treatment completion. Therefore, for men and women separately, ∆BDI-II and ∆QIDS were regressed on EEG component loading, adding age and baseline self-report score as covariates. If significant associations were found, factorial ANCOVAs were run to investigate if the individual loadings on the EEG component were significantly different in remitting and responsive patients relative to non-remitters and non-responders, respectively. In dataset 2, treatment response/remission and sex were added as fixed factors; in dataset 3, treatment response/remission, sex and antidepressant subtype were added as fixed factors. Age was added as covariate in all models. For remission, baseline BDI-II in dataset 2 and QIDS in dataset 3 (i.e. clinical score) were added as additional covariate, since baseline severity was related to treatment remission, while response was not.49,78
Subsequently, to assess the predictive value of the EEG component, a discriminant analysis on treatment outcome was performed for datasets 2 and 3. Prior studies had already tested several psychological (personality, anxiety etc.), demographic and behavioral measures and their ability to predict response or remission in these samples, and failed to find robust and clinically relevant predictors.49,51 The baseline model consisted of 1) age predicting response and 2) age and baseline severity predicting remission. Then we tested whether the model performance improved when EEG component loading was added as a predictor. Also, a receiver operating curve (ROC) of both models was regressed on response and remission to establish the predictive value of the discriminant analysis. The positive predictive value (PPV) and negative predictive value (NPV) for predicting response and remission were assessed.
Finally, we attempted to make a construct, based on the data of both dataset 2 and dataset 3. In order to merge both samples, Z-scores of the ∆BDI-II and ∆QIDS were calculated. A clinical cut-off loading was visually determined based on a graph with dataset 2 and 3 merged, representing standardized ∆BDI-II and ∆QIDS against the EEG component loading. Based on this cut-off, a simulation was built in order to evaluate the clinical usefulness, by calculating the percentage response/remission improvement if only subjects above or below this cut-off were assigned to rTMS or sertraline treatment.