Methods
Because all five experiments were highly similar, we describe them together to avoid redundancy. The hypotheses, procedures, outlier criteria, methods, and planned analyses of each experiment were preregistered on the Open Science Framework (OSF, osf.io/k8752/registrations). Raw data, scripts for the experiments, and analyses are available on OSF.
Participants
We collected data from 326 participants (161 female, 152 male, 3 diverse, 10 did not provide gender information; mean age = 29, range: 18-72) in five experiments (N1 = 45, N2 = 60, N3 = 60, N4 = 61, N5 = 100). All participants were right-handed and German-speaking. Experiment 1 was conducted in the lab at the University of Freiburg with a student sample. All other experiments were conducted online, and participants were recruited via Prolific (Palan & Schitter, 2018). The sample size for Experiment 1 was based on a power analysis using G*Power (Faul et al., 2007). We opted for a test power of 1 − β = .90, an alpha-error probability of α = .05, and an effect size of ηp2 = .18, which was reported for the c-CSE by Dignath et al. (2019). The sample sizes of Experiments 2-5 all exceeded the calculated sample size of Experiment 1 and were determined using Sequential Bayes Factors (Schönbrodt et al., 2016)[1].
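G*Power expects effect sizes for F tests as Cohen's f rather than as ηp2. The conversion can be sketched as follows; this is an illustration of the standard formula, not necessarily the exact procedure used for the power analysis, and the function name is hypothetical.

```python
import math

def partial_eta_squared_to_f(eta_p2: float) -> float:
    """Convert partial eta squared to Cohen's f: f = sqrt(eta_p2 / (1 - eta_p2))."""
    if not 0 <= eta_p2 < 1:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(eta_p2 / (1 - eta_p2))

# Effect size reported for the c-CSE by Dignath et al. (2019)
f = partial_eta_squared_to_f(0.18)
print(round(f, 2))  # 0.47
```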
Participants with excessive error rates (≥75%) or with error rates more than 3 SD above that experiment’s sample mean were excluded and replaced (see Table 1).
TABLE 1. Data exclusion on participant and trial level.
| Experiment: | 1 | 2 | 3 | 4 | 5 |
| Participant level | | | | | |
| Error rate > 75 % | 0 | 0 | 0 | 0 | 1 |
| Error rate deviating > 3 SD from sample mean | 0 | 1 | 1 | 1 | 1 |
| Trial level | | | | | |
| First trial of each block | 0.8 % | 0.8 % | 0.8 % | 0.8 % | 0.4 % |
| Trials following error trials | 7.1 % | 5.6 % | 5.9 % | 5.8 % | 7.8 % |
| Error trials (RT analysis only) | 7.2 % | 5.6 % | 5.9 % | 5.7 % | 7.8 % |
| RT > 3 SD from participant’s sample mean (RT analysis only) | 1.3 % | 1.4 % | 1.4 % | 1.4 % | 1.1 % |
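The participant-level exclusion rule can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the function name is hypothetical, and it assumes that the SD is the sample SD computed over all participants of an experiment, including the candidate for exclusion.

```python
from statistics import mean, stdev

def flag_excluded_participants(error_rates, max_rate=0.75, sd_cutoff=3.0):
    """Return indices of participants who meet either exclusion criterion."""
    m = mean(error_rates)
    s = stdev(error_rates)  # sample SD (n - 1 denominator); an assumption
    flagged = []
    for i, rate in enumerate(error_rates):
        # Criterion 1: excessive error rate; criterion 2: > 3 SD above the mean
        if rate >= max_rate or rate > m + sd_cutoff * s:
            flagged.append(i)
    return flagged

# Illustrative data: 29 participants with 5 % errors, one with 50 % errors
rates = [0.05] * 29 + [0.50]
print(flag_excluded_participants(rates))  # [29]
```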
Task and Stimuli
The experiment was programmed in JavaScript using the library jsPsych (de Leeuw, 2015) and closely followed the paradigm of Dignath et al. (2019). Each trial comprised a fixation cross, a distractor stimulus, a blank screen, a target stimulus, and a response window (see Fig. 1). The distractor was displayed for 139 ms, followed by a blank screen for 35 ms and the target for 130 ms. In Experiments 1-4, distractors and targets were numbers between ‘3’ and ‘6’. In Experiment 5, they were numbers from ‘1’ to ‘4’ and from ‘6’ to ‘9’. In congruent trials, the target stimulus was identical to the distractor stimulus; in incongruent trials, it differed from the distractor. In every trial, the target stimulus was presented slightly smaller than the distractor stimulus. After target presentation, a blank response window followed, which was terminated by a response or after a maximum of 1701 ms. Participants were instructed to respond to the target stimulus by pressing the corresponding number key on the keyboard. In Experiments 1-4, participants used only their right hand (‘3’: index finger, ‘4’: middle finger, ‘5’: ring finger, ‘6’: little finger). In Experiment 5, participants responded with their left hand to number stimuli from ‘1’ to ‘4’ (‘1’: little finger, ‘2’: ring finger, ‘3’: middle finger, ‘4’: index finger) and with their right hand to number stimuli from ‘6’ to ‘9’ (‘6’: index finger, ‘7’: middle finger, ‘8’: ring finger, ‘9’: little finger). If no response or an incorrect response was registered, a red screen was displayed as error feedback for 201 ms. Trials were separated by an inter-trial interval (ITI), which was either ‘short’ or ‘long’. In the short ITI condition, the fixation cross was shown for 250 ms; in the long ITI condition, it was shown for 2000 ms (Experiment 1), 3000 ms (Experiment 2), or 5000 ms (Experiments 4 and 5).
In the long ITI condition of Experiment 3, a blank screen was shown for 2750 ms followed by a fixation cross shown for 250 ms (resulting in a total ITI of 3000 ms).
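To make the trial timing easier to follow, the durations stated above can be collected in a short Python sketch. The constant and function names are hypothetical, and the sketch assumes that the ITI precedes the distractor and that error feedback, if shown, is appended at the end of the trial.

```python
# Durations in ms, taken from the task description
DISTRACTOR_MS = 139
BLANK_MS = 35
TARGET_MS = 130
MAX_RESPONSE_WINDOW_MS = 1701
ERROR_FEEDBACK_MS = 201
SHORT_ITI_MS = 250
LONG_ITI_MS = {1: 2000, 2: 3000, 3: 3000, 4: 5000, 5: 5000}  # by experiment

def distractor_target_soa() -> int:
    """Stimulus onset asynchrony between distractor and target."""
    return DISTRACTOR_MS + BLANK_MS

def max_trial_duration(experiment: int, long_iti: bool, error: bool) -> int:
    """Upper bound of one trial's duration, including the preceding ITI."""
    iti = LONG_ITI_MS[experiment] if long_iti else SHORT_ITI_MS
    total = iti + DISTRACTOR_MS + BLANK_MS + TARGET_MS + MAX_RESPONSE_WINDOW_MS
    if error:
        total += ERROR_FEEDBACK_MS
    return total

print(distractor_target_soa())                                # 174
print(max_trial_duration(1, long_iti=False, error=False))     # 2255
print(max_trial_duration(4, long_iti=True, error=True))       # 7206
```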
Additionally, we introduced a context manipulation. Distractor and target stimuli were shown either as an Arabic digit (e.g., ‘3’) or as the corresponding German number word in capital letters (e.g., ‘DREI’). In Experiment 5, digits and number words were each assigned a font color (orange or blue) and a response hand (left or right). For example, participants would respond with their left hand to digits that were presented in orange font. Distractor and target were always presented in the same context.
Procedure
After participants had provided informed consent, the task instructions were displayed. Participants were instructed to respond as quickly and as accurately as possible and to respond with their right hand only. If the error rate exceeded 40 % in the first ten trials of training, the instructions were presented again. If participants failed this accuracy test a second time, the experiment was terminated.
To avoid confounds of stimulus-response memory (e.g., full or partial stimulus and response repetitions, negative priming, or contingency learning), we used a confound-minimized design with two different stimulus-response subsets alternating across trials, so that even trials used a different stimulus-response subset than odd trials (see e.g. Jiménez & Méndez, 2013; Schmidt, 2013; Schmidt & Weissman, 2014; Spinelli et al., 2019). In each block, each of the responses was paired two times with each level of congruency, previous congruency, context, and previous context, resulting in a total of 128 trials per block. After a training block, participants performed eight experimental blocks. The ITI condition of the first block was randomized per participant and alternated from block to block thereafter. Participants were compensated with approximately £5 per hour.
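The trial count per block follows from fully crossing the design factors. A minimal Python sketch, assuming the four response keys of Experiments 1-4 and hypothetical labels for the factor levels; it only enumerates the design cells and does not reproduce the actual trial sequencing with alternating stimulus-response subsets:

```python
from itertools import product

responses = ["3", "4", "5", "6"]              # response keys in Experiments 1-4
congruency = ["congruent", "incongruent"]     # also used for previous congruency
context = ["digit", "word"]                   # also used for previous context
repetitions = 2                               # each cell occurs two times

# Cells: response x congruency x previous congruency x context x previous context
cells = list(product(responses, congruency, congruency, context, context))
trials_per_block = len(cells) * repetitions
print(trials_per_block)  # 128
```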
Analysis And Results
We decided to adjust the preregistered[2] analysis plan by switching from a frequentist to a Bayesian approach. Before running our main analysis, we confirmed that the paradigm produced CSEs (see Appendix for the corresponding analyses; see also Table 3).
To test our main hypothesis, we conducted a Bayesian ANOVA with the within-participant factors context transition [repetition vs. change] and ITI duration [short vs. long], participant as a random factor, and CSE scores as the dependent variable. The CSE score is the difference between the congruency effect after previously congruent trials and the congruency effect after previously incongruent trials. It was calculated per participant and condition as \(CSE = (\overline{RT}_{con \rightarrow inc} - \overline{RT}_{con \rightarrow con}) - (\overline{RT}_{inc \rightarrow inc} - \overline{RT}_{inc \rightarrow con})\), where the first subscript label denotes the congruency of the previous trial and the second label the congruency of the current trial. This analysis was repeated with mean error rates as the dependent variable.
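The CSE score can be illustrated with a short Python sketch. The cell means below are invented for illustration, the function name is hypothetical, and the c-CSE is computed as CSE after context repetition minus CSE after context change, as the row labels of Table 3 suggest.

```python
def cse_score(mean_rt):
    """CSE from cell means keyed by (previous congruency, current congruency)."""
    effect_after_con = mean_rt[("con", "inc")] - mean_rt[("con", "con")]
    effect_after_inc = mean_rt[("inc", "inc")] - mean_rt[("inc", "con")]
    return effect_after_con - effect_after_inc

# Hypothetical mean RTs (ms) of one participant in one ITI condition
repetition = {("con", "con"): 400, ("con", "inc"): 480,
              ("inc", "con"): 420, ("inc", "inc"): 470}
change = {("con", "con"): 405, ("con", "inc"): 475,
          ("inc", "con"): 415, ("inc", "inc"): 465}

cse_repetition = cse_score(repetition)   # (480 - 400) - (470 - 420) = 30
cse_change = cse_score(change)           # (475 - 405) - (465 - 415) = 20
c_cse = cse_repetition - cse_change      # context-transition effect: 10
print(cse_repetition, cse_change, c_cse)
```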
With this analysis approach, we tested whether c-CSEs are smaller for longer ITIs. Under the H1, we expected reduced c-CSEs in the long ITI condition relative to the short ITI condition. Statistically, the H1 predicts a two-way interaction between context transition and ITI duration. Bayes factors are reported as \(BF_{10} = \frac{p(data \mid H1)}{p(data \mid H0)}\) if BF10 > 1, and as \(BF_{01} = \frac{p(data \mid H0)}{p(data \mid H1)}\) if BF10 < 1. Thus, BF10 is the ratio of the probability of the data under the H1 to the probability of the data under the H0 (e.g., BF10 = 3 indicates that the data are three times as likely under the H1 model as under the H0 model), whereas BF01 is the inverse (e.g., BF01 = 3 indicates that the data are three times as likely under the H0 model as under the H1 model). In all analyses, Bayes factors for main effects were calculated against an intercept model for the H0 (e.g., for the main effect of context transition: H1 model = CSE ~ context transition + participant; H0 model = CSE ~ participant). Bayes factors for interactions were calculated by comparing posterior probabilities for a model including the main effects and the interaction term against a model including only the main effects (e.g., for the interaction between context transition and ITI duration: H1 model = CSE ~ context transition + ITI duration + context transition:ITI duration + participant; H0 model = CSE ~ context transition + ITI duration + participant). We used the standard prior scale of .5 for fixed effects in all analyses. BF10 < 3 and BF01 < 3 are considered indecisive.
Error percentages of the Bayes factor estimated with 10,000 iterations of Monte Carlo sampling are reported (a Bayes factor of 10 with an error percentage of 50% can be expected to fluctuate between 5 and 15).
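The reporting convention and the interpretation of the error percentage can be sketched in Python as follows. The function names are hypothetical; the treatment of a Bayes factor of exactly 1 (reported here as BF10) is an assumption, because the text does not cover that case.

```python
def report_bayes_factor(bf10, threshold=3.0):
    """Return (label, value, decisive) following the reporting convention."""
    if bf10 >= 1:            # bf10 == 1 reported as BF10 (assumption)
        label, value = "BF10", bf10
    else:
        label, value = "BF01", 1 / bf10
    return label, value, value >= threshold

def error_range(bf, error_percent):
    """Range within which a Bayes factor can be expected to fluctuate."""
    delta = bf * error_percent / 100
    return bf - delta, bf + delta

print(report_bayes_factor(0.25))   # ('BF01', 4.0, True)
print(report_bayes_factor(2.0))    # ('BF10', 2.0, False)
print(error_range(10, 50))         # (5.0, 15.0)
```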
In accordance with our preregistration, we excluded the first trial of each block and all trials following error trials. For the RT analysis, we also removed all error trials and trials with RTs deviating more than 3 SD from that participant’s conditional mean RT (see Table 1).
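The trial-level exclusions for the RT analysis can be sketched as follows. This is an illustration for the data of a single participant, not the authors' script: the `Trial` structure and function name are hypothetical, trials are assumed to be in chronological order, and the condition means and SDs are assumed to be computed after the other exclusions have been applied.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, stdev

@dataclass
class Trial:
    block: int
    condition: str
    rt: float
    correct: bool

def trials_for_rt_analysis(trials, sd_cutoff=3.0):
    """Apply the trial-level exclusions to one participant's trials."""
    kept, prev = [], None
    for t in trials:
        first_in_block = prev is None or prev.block != t.block
        after_error = not first_in_block and not prev.correct
        # Drop first trials, post-error trials, and error trials
        if not first_in_block and not after_error and t.correct:
            kept.append(t)
        prev = t
    rts_by_condition = defaultdict(list)
    for t in kept:
        rts_by_condition[t.condition].append(t.rt)
    result = []
    for t in kept:
        rts = rts_by_condition[t.condition]
        if len(rts) > 1:
            # Drop RTs more than 3 SD away from the condition mean
            if abs(t.rt - mean(rts)) > sd_cutoff * stdev(rts):
                continue
        result.append(t)
    return result
```

For example, given one first trial, 20 correct trials with an RT of 500 ms, and one correct trial with an RT of 5000 ms in the same condition, the function keeps only the 20 trials with 500 ms.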
The results of the analyses for each individual experiment are reported in Table 2.
Discussion Experiments 1–5
Experiments 1 to 5 tested whether the c-CSE becomes smaller with longer ITIs. Across the experiments, we varied the duration of the longer ITI (2000–5000 ms), the filling of the ITI (Experiment 3 used an unfilled ITI; all other experiments showed a fixation cross during the ITI), and the type and number of context features (in Experiments 1–4, the representation of the number stimulus varied; in Experiment 5, the representation of the number stimulus, the color of the number stimulus, and the response hand varied). All five experiments remained indecisive in the test of our main hypothesis. Because all experiments tested the same hypothesis with very similar experimental designs, we decided post hoc to pool the raw data of all experiments (total N = 326) and submit the CSE scores to a mega analysis (also known as Integrative Data Analysis: Curran & Hussong, 2009; Eisenhauer, 2021; Hussong et al., 2013) to maximize test power while keeping a more complex data structure than comparable meta-analytical approaches (Sung et al., 2014; Tierney et al., 2015). The mega analysis tested the same hypothesis as each individual experiment, i.e., whether the c-CSE is reduced with longer ITIs.
Table 2
Bayes factors resulting from the Bayesian ANOVAs conducted on mean RTs and mean error rates of each experiment. The subscript indicates whether the Bayes factor expresses evidence in favor of the H1 (BF10) or the H0 (BF01). Decisive evidence is printed in bold. The Bayes factor error percentage is given in brackets.
| Experiment: | 1 | 2 | 3 | 4 | 5 |
| RTs | | | | | |
| Context transition | BF01 = 1.769 (± 1.27%) | BF01 = 2.793 (± 1.29%) | BF01 = 5.559 (± 2.67%) | BF01 = 1.553 (± 1.32%) | BF10 = 2.073 (± 1.28%) |
| ITI duration | BF01 = 4.746 (± 1.84%) | BF10 = 4.727 (± 1.86%) | BF01 = 5.603 (± 1.04%) | BF01 = 7.085 (± 1.76%) | BF01 = 6.714 (± 1.81%) |
| Two-way interaction | BF01 = 1.789 (± 53.24%) | BF01 = 2.121 (± 52.46%) | BF10 = 1.421 (± 1.27%) | BF01 = 2.340 (± 53.18%) | BF10 = 1.241 (± 53.39%) |
| Error rates | | | | | |
| Context transition | BF01 = 2.091 (± 0.83%) | BF10 = 1.788 (± 0.83%) | BF01 = 5.216 (± 6.65%) | BF01 = 7.328 (± 0.84%) | BF01 = 8.703 (± 0.84%) |
| ITI duration | BF01 = 4.993 (± 1.04%) | BF01 = 3.749 (± 1.02%) | BF10 = 4.021 (± 1.51%) | BF01 = 2.549 (± 1.03%) | BF01 = 8.739 (± 1.03%) |
| Two-way interaction | BF01 = 3.307 (± 10.33%) | BF10 = 2.778 (± 10.35%) | BF01 = 5.000 (± 6.98%) | BF01 = 4.672 (± 10.52%) | BF01 = 5.100 (± 10.63%) |
Table 3
CSEs in RTs (ms) and error rates (%) and effects of context transition on the CSE, shown separately for all five experiments.
| | CSE in RTs (ms) | | | | | CSE in error rates (%) | | | | |
| Experiment: | 1 | 2 | 3 | 4 | 5 | 1 | 2 | 3 | 4 | 5 |
| Short ITI duration | | | | | | | | | | |
| Context repetition | 48 | 52 | 46 | 33 | 27 | 1.2 | 1.0 | 2.0 | 1.4 | 1.5 |
| Context change | 35 | 41 | 32 | 24 | 23 | 3.1 | 1.0 | 1.7 | 1.5 | 0.9 |
| Context-transition effect (c-CSE) | 12 | 11 | 13 | 9 | 4 | -2.0 | 0.0 | 0.3 | 0.0 | 0.6 |
| Long ITI duration | | | | | | | | | | |
| Context repetition | 40 | 34 | 32 | 32 | 31 | 2.4 | 1.8 | 0.3 | 0.3 | 0.9 |
| Context change | 34 | 29 | 37 | 23 | 12 | 3.1 | -1.3 | -0.4 | 0.3 | 1.0 |
| Context-transition effect (c-CSE) | 6 | 5 | -5 | 9 | 20 | -0.8 | 3.1 | 0.7 | 0.0 | -0.1 |
[1] For Experiments 2-5, we increased the sample size in batches of 30 participants and tested our main hypothesis under a Bayesian framework. If a decisive Bayes factor (smaller than 1/6 or larger than 6) was observed, we stopped data collection; otherwise, we continued. In Experiment 5, we started with a minimum sample size of 100 participants to avoid the accumulation of misleading evidence that can occur with smaller minimum sample sizes, as suggested by Schönbrodt et al. (2016). Please note that we followed the preregistered approach for the stopping rule: it was based on a Bayesian t-test of the difference in c-CSE between the short and the long ITI condition. In the results sections, however, we report the results of a Bayesian ANOVA (see Analysis and Results). The Bayesian ANOVA model uses a different approach to calculate prior distributions than the Bayesian t-test (see Rouder et al., 2009, 2012). Therefore, the Bayes factors resulting from the Bayesian ANOVA differ from (i.e., turned out to be more conservative than) the Bayes factors that were calculated with Bayesian t-tests as the criterion for the stopping rule.
[2] The preregistration specified a frequentist repeated-measures ANOVA to test our hypotheses. However, to allow null-model testing in a consistent analysis plan and to avoid violations of NHST assumptions by repeated testing in the mega analysis, we decided to switch to a Bayesian framework for all analyses, using Bayesian ANOVAs (R package ‘BayesFactor’; Morey et al., 2015). The model adhered closely to the originally preregistered analysis plan, using the same factors and dependent variables and including participant as a random effect. The results of the preregistered analysis plan can be found in the online supplement.
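The sequential stopping rule described in footnote 1 can be sketched in Python as follows. This is a schematic illustration only: `bf10_at` is a hypothetical callable that returns the Bayes factor of the stopping test for the first n participants, and the upper bound `max_n` is an assumption added so that the loop always terminates; the footnote does not state a maximum sample size.

```python
def run_sequential_design(bf10_at, start_n=30, batch=30, max_n=300, bound=6.0):
    """Collect data in batches until the Bayes factor is decisive.

    Returns the final sample size and the Bayes factor observed at that size.
    """
    n = start_n
    while True:
        bf10 = bf10_at(n)
        decisive = bf10 > bound or bf10 < 1 / bound
        # Stop on decisive evidence or when another batch would exceed max_n
        if decisive or n + batch > max_n:
            return n, bf10
        n += batch

# Toy evidence function: indecisive until 90 participants have been tested
def toy_bf10(n):
    return 7.0 if n >= 90 else 2.0

print(run_sequential_design(toy_bf10))                # (90, 7.0)
print(run_sequential_design(toy_bf10, start_n=100))   # (100, 7.0)
```

Passing `start_n=100` mirrors the larger minimum sample size used in Experiment 5.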