Participants
Participants included 22 bipolar patients (mean age: 29.1 years [SEM = 1.67], 17 females; Table 1), and 27 healthy participants (27.5 years [SEM = 1.18], 15 females). Bipolar participants were assessed by a consultant psychiatrist who confirmed the diagnosis of BD (I or II) using the structured clinical interview for the International Statistical Classification of Diseases and Related Health Problems (ICD-11)[49]. Patients included in the study were euthymic for at least 2.5 months before recruitment. Additional inclusion criteria were: most recent episode being depression, aged 18–50 years, absence of symptoms from other mental health conditions beyond BD, and no history of substance abuse. We assessed residual mood symptoms and cognitive performance using validated scales on mania, anxiety, depression and tasks on executive and general cognitive performance. See Table 1 for further details, and Supplemental Material for sample size estimates (minimum sample size estimated was 20 per group, to identify volatility effects at 0.8 power). The study was approved by the Institutional Review Board of National Research University Higher School of Economics and the Local Ethical Committee of the First Moscow State Medical University. All participants provided written informed consent.
Table 1
Group demographics and variables describing the measures of affective state, cognitive and executive function in euthymic bipolar disorder patients (BD, N = 22) and healthy control participants (HC, N = 27).
|
Bipolar group
(N = 22)
|
Healthy control group
(N = 27)
|
Statistics (Permutation test p-value and Bayes factor)
|
Age (years), M (SEM)
|
29.1 (SEM 1.67)
|
27.5 (SEM 1.18)
|
P = 0.4921, BF10 = 0.3472
|
Sex (females), n (%)
|
17 (77.2%)
|
15 (56%)
|
χ2 = 1.6559, df = 1, P = 0.1849; BF10 = 1.102
|
Duration of illness (years), M (SEM)
|
7.4 (SEM 1.04)
|
NA
|
NA
|
Last episode polarity, (%)
|
depression (100%)
|
NA
|
NA
|
Time in euthymia (months), M (SEM)
|
4.9 (0.68)
|
NA
|
NA
|
Bipolar II, n (%)
|
17 (77.3%)
|
NA
|
NA
|
Bipolar I, n (%)
|
5 (22.7%)
|
NA
|
NA
|
Antidepressant, n (%)
|
10 (45.5%)
|
NA
|
NA
|
Antipsychotic, n (%)
|
14 (63.6%)
|
NA
|
NA
|
Mood stabiliser, n (%)
|
16 (81.8%)
|
NA
|
NA
|
Summary: Taking psychiatric medication, n (%)
|
19 (86.4%)
|
NA
|
NA
|
HADS anxiety, M (SEM)
|
5.5 (0.94)
|
5 (0.64)
|
PFDR = 0.64, BF10 = 0.31
|
HADS depression, M (SEM)
|
4 (0.69)
|
3.30 (0.57)
|
PFDR = 0.4003, BF10 = 0.3824
|
Beck Depression Inventory, M (SEM)
|
8.3 (1.75)
|
5 (1)
|
PFDR = 0.1038, BF10 = 0.8668
|
Altman's Mania Scale, M (SEM)
|
1.7 (0.50)
|
2.6 (0.50)
|
PFDR = 0.2040, BF10 = 0.5652
|
State-Trait Anxiety Inventory (state subscale on MEG day)1, M (SEM)
|
35.5 (1.72)
|
33.7 (1.90)
|
PFDR = 0.4571, BF10 = 0.3833
|
Trail Making Test (Part 1), M (SEM)
|
23.3 (1.21)
|
21.2 (1.22)
|
PFDR = 0.2216, BF10 = 0.5296
|
Trail Making Test (Part 2), M (SEM)
|
87.8 (8.5)
|
55.7 (5.6)
|
PFDR = 0.0022*, BF10 = 14.907
|
Wisconsin Card Sorting Test, M (SEM)
|
29.3 (0.92)
|
31.6 (0.67)
|
PFDR = 0.0364, BF10 = 1.6560
|
Mini-Mental State Examination, M (SEM)
|
29.5 (0.18)
|
29.5 (0.17)
|
PFDR = 0.8678, BF10 = 0.2875
|
For MEG analysis, data were available for 27 HCs and 21 BDs. Values are provided as mean and SEM (M, SEM) or as count and percentage (n, %). Between-group differences in scale values were assessed using permutation tests. Our alpha significance level was set at 0.05, and we controlled the false discovery rate (FDR) at q = 0.05 for multiple tests. Permutation test p-values were complemented with Bayes factor analysis to assess the amount of evidence in favour of H1 or H0. Age and sex distribution were comparable across groups (Age: PFDR = 0.4921, no-significant differences; BF10 = 0.3472, providing anecdotal evidence for H0; Sex: Chi-squared statistic, χ2 = 1.6559, df = 1, P = 0.1849, no significant; BF10 = 1.102, no evidence for H0 or H1). There was substantial and anecdotal evidence supporting no differences between BD and HC groups in anxiety, depression, mania, and general cognitive scores (Bayes factor, BF10, in the range 1/3–1: anecdotal evidence; 1/10–1/3: substantial evidence). Between-group differences after FDR control were observed exclusively for Part 2 of the Trail Making Test (denoted by *), assessing task switching, based on strong evidence. There was anecdotal evidence for differences in Wisconsin Card Sorting Test scores, assessing executive functions. However, this effect was not significant after FDR control. 1State anxiety scores on the day of the experimental session were available for 22 bipolar patients and 18 healthy participants. Abbreviations: HADS, Hospital Anxiety and Depression Scale. The type of mood stabiliser taken by bipolar participants was mainly Lamotrigine (n = 11), but a few patients were taking Carbamazepine (n = 2), Valproic acid (n = 2), or Lithium (n = 1). See Supplementary Materials for further details on the scales and the corresponding references.
Reward-based motor sequence learning task
Participants underwent an initial fine motor control assessment, then completed a validated motor-based decision-making paradigm[48] (Fig. 1A), which combines probabilistic binary reward-based learning within a volatile setting (reminiscent of reversal learning) with the execution of motor sequences to express decisions. Participants learned two sequences of four finger presses, followed by a 320-trial test phase. In each trial, they were required to choose and perform one of the sequences to potentially earn a reward (5 points; Fig. 1A). Reward probabilities for sequences were reciprocal (p, 1-p) and changed pseudorandomly (Fig. 1B). The aim was to infer the reward probability associated with each sequence ('action-outcome' contingencies henceforth) and adjust their choices considering changing contingencies. Accumulated points translated to monetary rewards. See timeline in Fig. 1C. The task, programmed in MATLAB using Psychtoolbox, recorded participants' keypress timings to evaluate reaction time (RT) and performance tempo (Fig. 1D). See Supplemental Material.
MEG recording and preprocessing
MEG was performed using a 306-channel system (Elekta Neuromag VectorView), with head movements tracked by a head position indicator with four coils. Concurrently, ECG and EOG were recorded for MEG artefact rejection. Recordings were sampled at 1000 Hz and filtered between 0.1–330 Hz. MEG preprocessing involved head movement correction, noise reduction, and channel selection using standard methods ([50]; Elekta Maxfilter software; Supplemental Material). MEG data was further processed using MNE-python[51] (Python version 3.11.5) and custom Python scripts, lowpass filtered at 125 Hz, downsampled to 250 Hz, with a notch filter applied at 50 and 100 Hz. Independent component analysis (FastICA) removed eye and heart artifacts (3.3. ICs on average per participant).
General task performance
General task performance was assessed using the win rate (rewarded trials), average tempo (mean inter-key press interval in a sequence, mIKI), and RT (time from stimulus to first key press).
To assess practice effects in performance tempo and RT over trials as a function of the group, we employed Bayesian multilevel regression modelling (BML) with the logarithm of trial-wise RT or mIKI (in milliseconds) as the dependent variables. BML was performed with the brms R package[52, 53]. For each dependent variable, models of decreasing complexity were constructed (Table S1). Details on model comparison can be found in the Supplemental Material.
Modelling decision-making behaviour using hierarchical Gaussian filters
To assess probabilistic learning in our task we used a validated hierarchical Bayesian model, the 3-level perceptual HGF for binary categorical inputs[23, 24](Fig. 2a). This model described how participants infer hidden states about the tendency of the action-outcome contingencies on trial k, x2(k)(level 2), and the rate of change in that tendency (log-volatility), x3(k). Level 1 represents the binary reward input. Gaussian belief distributions on levels 2 and 3 are represented by their posterior mean (µ2(k), µ3(k)) and posterior variance (uncertainty: σ2, σ3). Precision is the inverse variance or uncertainty, πi (i = 2, 3). Belief updating on each level i and trial k is driven by prediction errors, and modulated by precision ratios, weighting the influence of precision or uncertainty in the current level and the level below. This is termed precision-weighted PE, pwPEs. For level 2, belief updating takes the simple form:
$$\varDelta {\mu }_{2}^{k}={\mu }_{2}^{\left(k\right)}-{\mu }_{2}^{\left(k-1\right)}={\sigma }_{2}^{k}{\delta }_{1}^{\left(k\right)}$$
1
Thus, updating beliefs about the tendency of the action-outcome contingencies is proportional to the PE about action outcomes, δ1(k), weighted with the estimation uncertainty on that level, σ2(k). Here, pwPE is equal to σ2(k)δ1(k). See general equation, representing updates on level 3, in Supplementary Material, and [24].
States x2, x3 evolve as random Gaussian walks, with volatility states x3 directly influencing the time evolution of x2 through its variance (conditional on past values):
\({x}_{2}^{k}\sim N\left({x}_{2}^{k-1},{f}_{2}\left({x}_{3}\right)\right)\)
(2)
with (dropping k for simplicity)
$${f}_{2}\left({x}_{3}\right)\stackrel{\scriptscriptstyle\text{def}}{=}exp\left(\kappa {x}_{3}+{\omega }_{2}\right)$$
3
In (3), ω2 represents the invariant (tonic) portion of the log conditional variance of x2, while κ is a coupling constant that modulates the influence of phasic volatility, x3, on the size of action-outcome belief updates. The step size on level 3 is modulated by ω3, a high-level tonic volatility parameter. See Supplemental Material for modelling details.
To assess how beliefs mapped to decisions, we coupled this perceptual model to response models previously used in similar tasks[24, 37, 38]. First, we considered a unit-square sigmoid response model where choice probability is shaped by a free fixed (time-invariant) parameter ζ, interpreted as inverse decision noise: the sigmoid approaches a step function as ζ tends to infinity. This constituted our model M1. Model M2 was similar but employed a two-level HGF with constant volatility. M3 combined the 3-level HGF with a response model where the sigmoid function depends on the trial-wise prediction of log-volatility, \(\zeta ={e}^{-{\mu }_{3}^{\left(k-1\right)}}\) [22](Fig. 2a). In this model, higher estimates of volatility lead to a more stochastic mapping from beliefs to decisions. As a result, there is an increased likelihood of choosing responses that deviate from predictions, consistent with increased exploration (exploring whether the contingency has changed). In models M1 and M3, parameters ω2 and ω3 were free; ω2 was also free in M2. Additionally, ζ was free in M1 and M2, while initial values µ3(0) and σ3(0) were free in M3. A fourth model, M4, was constructed similarly to M3 but replaced the free parameter ω2 with κ[28].
Models were fit to individual behavioural data (series of chosen and rewarded sequences: responses and observed outcomes), using prior values described in Table S2. Log model evidence was used for model comparison using random-effects Bayesian model selection (Supplemental Materials). As in previous work, simulations were conducted to quantify how well free model parameters could be estimated[27, 37]. Relevant belief and uncertainty trajectories were used subsequently for our MEG analysis (Fig. 2b; see Results). The models were implemented as a part of the TAPAS toolbox[54]. We used the HGF release v7.1 in Matlab R2020b, and functions ‘tapas_ehgf_binary’.
Assessing motor invigoration
We additionally investigated whether trial-by-trial predictions about the probabilistic action-outcome mapping, \({\stackrel{\prime }{\mu }}_{2}^{\left(k\right)}\), differentially influenced motor performance in our groups. Following Tecilla et al.[48], given the sign of \({\stackrel{\prime }{\mu }}_{2}^{\left(k\right)}\) is arbitrary, we used as predictor for Bayesian multilevel modelling the absolute value,\(\left|{\stackrel{\prime }{\mu }}_{2}^{\left(k\right)}\right|\), representing the strength of those predictions, and assessed its association with performance variables (log-mIKI, log-RT). We hypothesised a negative correlation, suggesting that stronger expectation about reward contingencies speeds performance. We also hypothesised a greater sensitivity to these predictions (steeper slope) in BD compared to HC. Details of the models are in Table S3.
Source reconstruction of MEG signals
We reconstructed MEG signals using Linearly Constrained Minimum Variance beamforming (LCMV[55]) in MNE-Python and individual T1-MRI images for cortical divisions with Freesurfer 6.0[56, 57], http://surfer.nmr.mgh.harvard.edu/). We aligned MRI and MEG coordinate systems, selected the Desikan-Killiany–Tourville atlas for cortical parcellation (DKT[58]), and performed forward modelling with boundary element models[37].
We focused on alpha, and beta frequency bands, band-pass filtering signals between 1–40 Hz before LCMV beamforming. Theta-band activity was also examined given its robust association with feedback processing[34, 59], relevant for win/lose outcomes in our task. Gamma frequency analysis followed a similar process (30–124 Hz band-pass filter). Time courses were extracted for regions of interest (ROIs) associated with decision-making under uncertainty and reward processing[35, 60–65], and linked with impairments in fronto-striatal reward circuitry in [5, 10, 66, 67]. These included the (1) ACC, (2) OFC, including the ventromedial PFC, (3) dorsomedial PFC (dmPFC), (4) dorsolateral PFC (dlPFC). We also included the (5) primary motor cortex (M1) and (6) premotor cortex (PMC), to assess motor activity during decision-making[68].
Our study's ROIs comprised 16 bilateral labels in eight areas from the DKT atlas: (1) rostral and caudal ACC, (2) lateral and medial OFC (including vmPFC), (3) superior frontal gyrus (dmPFC, and supplementary motor area, SMA), (4) rostral middle frontal gyrus (rMFG), (5) precentral gyrus (M1), and (6) caudal MFG. Time series extraction utilised the PCA flip method in MNE-Python. Although the 'flip' operator was not relevant for our time-frequency analysis, it was essential for preparing the source-reconstructed time series for subsequent connectivity analysis. See anatomical label references in Supplemental Material.
Convolution modelling of time-frequency responses during outcome processing
We used a validated convolution-modelling approach to analyse frequency-domain amplitude changes related to belief updating and uncertainty following outcome presentation[36, 38, 69]. Building on previous work[37], this frequency-domain general linear model (GLM) included as parametric regressors the unsigned pwPE updating beliefs on level 2 (representing precision-weighted Bayesian surprise; the absolute value is preferred for the binary HGF where sign on level 2 is arbitrary[70, 71]), and uncertainty measures (σ2, σ3). It also included discrete regressors for win/lose outcomes and error trials. To avoid regressor collinearity and potential GLM misspecification, we excluded the level 3 pwPE[37, 38], due to its high linear correlation with the unsigned pwPE on level 2 (Supplementary Materials).
The GLM was applied to concatenated epochs of source-reconstructed data in our ROIs, using Morlet wavelets for time-frequency (TF) analysis in 4–100Hz and within − 0.5–1.8 s (Figure S1). We hypothesised between-group differences in gamma and concomitant alpha/beta activity during pwPE processing, and in alpha/beta activity during uncertainty encoding. Theta 4–7 Hz activity was assessed for the win/lose regressors[34, 59].
We conducted this analysis using SPM12 software (http://www.fil.ion.ucl.ac.uk/spm/), adapting original code by Spitzer et al.[72], as used in Hein et al.[37], with additional details available in the Supplemental Materials.
Frequency-resolved functional connectivity
To analyse directed functional connectivity between frequency-resolved activity in our ROIs, we employed time-reversed Granger causality (TRGC[73]; Supplemental Materials) as a robust metric for directed information flow[74]. Following the recommendations of Pellegrini et al.[74], we applied TRGC in the frequency domain to LCMV-based source-reconstructed time series from our 16 ROIs after the PCA flip transformation.
Our analysis focused on between-group differences in the directionality of information flow within the 8–30 Hz range during the 0.5–1 s interval of outcome processing for trials with large unsigned pwPEs updating beliefs at level 2. We employed a median split of unsigned pwPE values, yielding approximately 160 high-|pwPE| trials per participant. This frequency range was selected based on evidence that beta-band functional connectivity from the PFC effectively differentiates levels of predictability, exhibiting reduced values during unpredictable trials[34]. By examining TRGC in trials with high unsigned pwPE values, we anticipated a general decrease in beta-band TRGC in HC, in parallel with alpha/beta amplitude suppression during belief updating. We hypothesised that this pattern would be disrupted in BD. The TRGC analysis was conducted using the ROIconnect plugin for EEGLAB[74], adapted for our MNE-python LCMV outputs. See Supplemental Materials.
Statistical analysis
Between-group analyses of behavioural, computational, and TRGC-derived variables used independent-sample permutation tests (5000 permutations) in MATLAB®. Within-subject analyses used paired permutation tests. We maintained an alpha significance level at 0.05 and controlled false discovery rates (FDR) at q = 0.05 for multiple tests. Non-parametric effect sizes are reported as probability of superiority[75, 76] (Δ). Non-significant effects were further evaluated using Bayes Factors (BF10), interpreted following Wetzels and Wagenmakers[77].
Statistical analysis of source-level time-frequency images used cluster-based permutation testing in the FieldTrip Toolbox[78, 79] (1000 permutations). We averaged TF activity across frequency bins within each band (theta, alpha, beta; 60–100 Hz for gamma[37]). Temporal intervals of interest for statistical analyses were selected based on previous research[37, 38, 59]: 0.5–1.8 s for parametric regressors, 0.2–1 s for win/lose regressors. We controlled the family-wise error rate (FWER) at 0.025. See Supplemental Materials.