Participants
We recruited 36 participants from the Universitat de València, aged between 18 and 32 years (mean = 22.84, SD = 3.90; 14 females), all with normal or corrected-to-normal vision. All participants were informed that data would be collected anonymously during the session and signed an informed consent form before starting. All experimental protocols were approved by the Ethics Board of the Universidad Nebrija (Comité de Ética en Investigación – CEI), and the studies were carried out in accordance with the general guidelines of the Ethics Committee.
Materials and Procedure
2D Simon task
All stimuli were presented on a 15.6" laptop screen viewed from a distance of about 55 cm. The stimuli were two solid-coloured squares, one blue and one red, displayed on a white background. Each trial began with a central fixation cross (+) shown for 500 ms. Participants were asked to press the "F" key if they saw a red square and the "J" key if they saw a blue square. They were asked to respond as quickly as possible while avoiding mistakes. A maximum response time of 3000 ms was allowed.
All trials belonged to one of three critical experimental conditions (congruent, incongruent, and neutral). Trials were defined as congruent when the target was displayed on the same side as the correct response key (e.g., a red square positioned on the left side of the screen). Incongruent trials were those in which the target location and the correct response were on opposite sides (e.g., a red square displayed on the right side of the screen). Finally, trials in which the target was centred on the fixation cross were defined as neutral (see Fig. 1 for an illustration of the stimulus types employed in our task).
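The condition definitions above can be summarised in a few lines of code. The following is a minimal sketch, not the implementation used in Gorilla; the function name `classify_trial`, the position labels, and the assumption that "F" is a left-hand key and "J" a right-hand key (standard keyboard layout) are ours.

```python
# Side of the correct response key for each colour.
# Assumption: "F" (red) is pressed with the left hand, "J" (blue) with the right.
RESPONSE_SIDE = {"red": "left", "blue": "right"}


def classify_trial(colour: str, position: str) -> str:
    """Return the condition of a trial showing a `colour` square at `position`.

    `position` is one of "left", "centre", "right" (hypothetical labels).
    """
    if position == "centre":
        return "neutral"
    if position not in ("left", "right"):
        raise ValueError(f"unknown position: {position!r}")
    # Congruent when the stimulus side matches the side of the correct key.
    return "congruent" if RESPONSE_SIDE[colour] == position else "incongruent"


if __name__ == "__main__":
    for colour in ("red", "blue"):
        for position in ("left", "centre", "right"):
            print(colour, position, classify_trial(colour, position))
```

The same rule would apply to the 3D version by replacing colour with the avatar's action and position with the location of the moving avatar.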
The 2D experimental task was built using Gorilla Experiment Builder[50] and executed within the same web-based platform (www.gorilla.sc).
Table 1. Mean accuracy and latency results per condition, modality, and task. Standard deviations are presented in parentheses.

| Task | Stimulus Type | Accuracy | RT (in ms) |
| --- | --- | --- | --- |
| Simon 2D | Congruent | 0.96 (0.04) | 420 (75.53) |
| Simon 2D | Incongruent | 0.89 (0.08) | 462 (87.68) |
| Simon 2D | Neutral | 0.95 (0.04) | 419 (67.53) |
| Simon 3D | Congruent | 0.98 (0.02) | 641 (110.06) |
| Simon 3D | Incongruent | 0.91 (0.06) | 679 (100.75) |
| Simon 3D | Neutral | 0.96 (0.03) | 636 (102.64) |
| Flanker 2D | Congruent | 0.98 (0.02) | 441 (71.90) |
| Flanker 2D | Incongruent | 0.90 (0.14) | 515 (123.58) |
| Flanker 2D | Neutral | 0.98 (0.02) | 441 (69.60) |
| Flanker 3D | Congruent | 0.98 (0.02) | 372 (82.40) |
| Flanker 3D | Incongruent | 0.94 (0.06) | 422 (89.23) |
| Flanker 3D | Neutral | 0.98 (0.03) | 450 (95.96) |
Simon 3D task
The stimuli consisted of three male human avatars that were programmed to perform two different actions: clapping hands and raising a hand. Each action was displayed within a virtual reality environment resembling a classroom. One of the avatars was positioned in front of the participant's point of view and centred within the scene, and the other two were placed to his left and right within the field of view. All three avatars were displayed simultaneously and programmed to remain in an idle position. Each trial began with a neutral auditory stimulus lasting 500 ms that alerted participants to the beginning of a new trial. After 500 ms, one of the avatars would perform one of the programmed movements. As in the 2D version, a maximum response time of 3000 ms was allowed (see Fig. 2). A video of the virtual environment and the task is provided at https://doi.org/10.6084/m9.figshare.19984631.
Within the virtual environment, each controller became a virtual hand; thus, participants controlled each virtual hand with the corresponding real hand and were asked to give quick and accurate responses by pulling the trigger on each controller to indicate whether the avatar performing a movement was clapping (left trigger) or raising a hand (right trigger). As in the 2D version of the task, all trials belonged to one of three critical experimental conditions (congruent, incongruent, and neutral). Trials were defined as congruent when the moving avatar's action was performed on the same side as the correct response trigger (e.g., the leftmost avatar clapping). Similarly, incongruent trials were defined as those in which the critical avatar's location and the correct response trigger were on opposite sides (e.g., the rightmost avatar clapping). Finally, trials in which the moving avatar was the one centred in the participant's point of view were defined as neutral.
The 3D experimental task was created using Vizard 6.0 (WorldViz), a Python-based software package (Python v. 2.7.12). The experiment script was executed on a high-end gaming laptop (MSI GL76) equipped with an Intel Core i7-10750H (2.6 GHz), 32 GB of RAM, and an NVIDIA GeForce RTX 2070 video card, running the Windows 10 operating system (64 bit). To ensure and maintain a high-performance connection between the PC and the VR HMD, battery-saving settings were disabled. 3D stimuli were presented through the HTC Vive Pro HMD (HTC Vive Pro, 2018) at a resolution of 2880 × 1600 pixels (1440 × 1600 per eye) and a 90-Hz refresh rate. This setup provides a 110-degree field of view and a highly immersive experience, based on a high-quality display and a stable tracking system[51].
Regardless of the task version (2D or 3D), each task began with a practice period to familiarise participants with the task. This practice included 12 trials, four from each condition. The experimental trials followed the practice and included 48 trials per condition. Experimental trials were distributed across three blocks. Each block included 16 trials per condition, presented in random order. Overall, the 2D task was completed in around 5 minutes and the 3D task in about 8 minutes. A 15-minute distracting task was presented between the two versions of the task. The presentation order of the two task modalities was counterbalanced across participants.
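The block structure described above (three blocks, 16 randomly ordered trials per condition in each) can be sketched as follows. This is an illustration under our own assumptions rather than the script used in the study: the names `build_blocks` and `modality_order` are hypothetical, and the simple alternation of task order across participants is only one possible counterbalancing scheme.

```python
import random

CONDITIONS = ("congruent", "incongruent", "neutral")
N_BLOCKS = 3
TRIALS_PER_CONDITION_PER_BLOCK = 16


def build_blocks(seed=None):
    """Return N_BLOCKS lists of condition labels, each shuffled independently."""
    rng = random.Random(seed)
    blocks = []
    for _block in range(N_BLOCKS):
        block = [
            condition
            for condition in CONDITIONS
            for _rep in range(TRIALS_PER_CONDITION_PER_BLOCK)
        ]
        rng.shuffle(block)  # random presentation order within the block
        blocks.append(block)
    return blocks


def modality_order(participant_index):
    """Alternate 2D-first and 3D-first across participants (assumed scheme)."""
    return ("2D", "3D") if participant_index % 2 == 0 else ("3D", "2D")


if __name__ == "__main__":
    blocks = build_blocks(seed=1)
    print([len(block) for block in blocks])       # 48 trials per block
    print(modality_order(0), modality_order(1))   # order for two participants
```

Shuffling within each block, rather than across the whole session, keeps every condition equally represented in every block.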
Results
Collected data were processed and cleaned in RStudio[52] and analysed with JASP[53]. Descriptive analyses were undertaken to ascertain reaction times and accuracy (see Table 1). Mean reaction times (RT) were computed for each condition and participant from trial-level data, including only accurate responses. Additionally, RTs that were more than 2.5 SD below or above the mean RT per condition, or that were associated with timed-out responses, were rejected (2.94% of the data in the 2D modality and 1.86% of the data in the 3D modality).
We carried out a 3 (Stimulus Type: congruent, incongruent, and neutral) x 2 (Task Modality: 2D and 3D) repeated-measures ANOVA on the RT data. A significant main effect of Stimulus Type was found (F(2, 70) = 62.016, p < 0.001, ηp2 = 0.639). Post hoc analyses revealed differences between congruent and incongruent conditions (MDiff = -39.952, SE = 4.326, pbonf < 0.001), reflecting the classical Simon interference effect, and between incongruent and neutral conditions (MDiff = 43.296, SE = 4.326, pbonf < 0.001), showing an incongruency effect. No significant difference was found between congruent and neutral conditions (MDiff = 3.343, SE = .773, pbonf = 1.000). Additionally, the main effect of Task Modality was also significant (F(1, 35) = 189.223, p < 0.001, ηp2 = 0.844), with latencies being longer in the 3D modality than in the 2D modality (MDiff = 218.470). Importantly, there was no significant interaction between Stimulus Type and Task Modality (F(2, 35) = 0.271, p = 0.277, ηp2 = 0.008) (see Fig. 3).
A similar repeated-measures ANOVA was performed on the accuracy scores from both tasks. When sphericity assumptions were violated, the Greenhouse-Geisser correction was applied. A significant main effect of Stimulus Type was found (F(1.652, 70) = 48.783, p < 0.001, ηp2 = 0.582). Post hoc analyses showed significant differences between congruent and incongruent conditions (MDiff = 0.072, SE = 0.006, pbonf < 0.001; namely, a Simon effect), and between incongruent and neutral conditions (MDiff = -0.056, SE = 0.006, pbonf < 0.001; namely, an incongruency effect). No significant difference was found between congruent and neutral conditions (MDiff = 0.016, SE = 0.008, pbonf = 0.110). The main effect of Task Modality was also significant (F(1, 35) = 6.535, p = 0.015, ηp2 = 0.157), with responses being more accurate in the 3D modality than in the 2D modality (MDiff = 0.016). Finally, there was no significant interaction between Stimulus Type and Task Modality (F(1.458, 35) = 0.225, p = 0.728, ηp2 = 0.006) (see Fig. 3).
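The cleaning steps described above can be illustrated with a short sketch. The original cleaning was done in RStudio; the Python version below is only an approximation under our own assumptions: the function name `trim_rts` and the trial format are hypothetical, and we assume the 2.5 SD criterion is applied within each participant and condition after removing errors and time-outs.

```python
from statistics import mean, stdev

TIMEOUT_MS = 3000   # maximum response time allowed in both task versions
SD_CUTOFF = 2.5     # rejection criterion in standard deviations


def trim_rts(trials):
    """Return the RTs kept for one participant in one condition.

    `trials` is a list of dicts with keys "rt" (ms, or None for a time-out)
    and "correct" (bool); this format is assumed for illustration.
    """
    # Keep only accurate responses given within the response window.
    valid = [
        t["rt"] for t in trials
        if t["correct"] and t["rt"] is not None and t["rt"] < TIMEOUT_MS
    ]
    if len(valid) < 2:
        return valid  # the SD is undefined for fewer than two values
    centre, spread = mean(valid), stdev(valid)
    lower = centre - SD_CUTOFF * spread
    upper = centre + SD_CUTOFF * spread
    return [rt for rt in valid if lower <= rt <= upper]


if __name__ == "__main__":
    trials = [{"rt": 400 + 5 * i, "correct": True} for i in range(15)]
    trials.append({"rt": 1500, "correct": True})   # slow outlier
    trials.append({"rt": 430, "correct": False})   # error trial
    trials.append({"rt": None, "correct": False})  # timed-out trial
    kept = trim_rts(trials)
    print(len(kept), mean(kept))
```

In this toy data set the error, the time-out, and the 1500 ms response are excluded, leaving the 15 responses between 400 and 470 ms.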
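As a consistency check, partial eta squared can be recovered from a reported F ratio and its degrees of freedom using the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the two main effects reported for RT; the function name `partial_eta_squared` is ours.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # Main effect of Stimulus Type on RT: F(2, 70) = 62.016
    print(round(partial_eta_squared(62.016, 2, 70), 3))
    # Main effect of Task Modality on RT: F(1, 35) = 189.223
    print(round(partial_eta_squared(189.223, 1, 35), 3))
```

Both values round to the effect sizes reported in the text (0.639 and 0.844).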
Discussion
The 2D and 3D versions of the Simon task showed a markedly similar pattern of response latencies and accuracy across all stimulus type conditions. In both settings, incongruent stimuli elicited longer response latencies than congruent and neutral stimuli, and the classic Simon and incongruency effects were replicated in both the 2D and 3D versions of the paradigm. RT differed significantly between task modalities, with participants showing shorter response latencies when the task was performed in the 2D context, an effect that can be readily explained by core differences in stimulus presentation. Whereas in the 2D version of the task the stimuli were displayed without a movement component, in the 3D version participants had to withhold their response until the avatar's movement was evident. This interpretation is consistent with the small difference in accuracy between task modalities (which should not be influenced by the time the characters needed to initiate a movement), and with the similar pattern of effects found across modalities in all task conditions. In conclusion, our results showed that both the 2D and 3D tasks are equally capable of capturing participants' inhibitory control over prepotent responses, as measured by the Simon interference effect.