Observers
Sixteen observers (10 females and 6 males, aged 22–34 years), including one author (A.D.), participated in the experiment. The required number of observers was determined by a power analysis using G*Power (Faul et al., 2007). The effect size was set to generalized eta squared, η²G = 0.14, the value reported in a previous study that used the same experimental protocol and a comparable number of manipulations (Fernández et al., 2019). The sample size was chosen to achieve 80% power for detecting a potential interaction between temporal attention and expectation.
All observers provided written informed consent and had normal or corrected-to-normal vision. All experimental procedures were in agreement with the Declaration of Helsinki and were approved by the New York University Institutional Review Board.
Apparatus
Stimuli were generated using an Apple iMac (3.06 GHz, Intel Core 2 Duo) and MATLAB 2012b (Mathworks, Natick, MA, USA) with the Psychophysics Toolbox (Brainard, 1997; Kleiner et al., 2007), and were presented on a color-calibrated CRT monitor (1280 × 960 screen resolution, 100-Hz refresh rate). Observers were seated 57 cm from the display, with head movements limited by a chinrest. An Eyelink 1000 eye tracker (SR Research, Ottawa, Ontario, Canada) recorded eye position and was used for online monitoring to ensure central fixation throughout each trial. If the observer blinked, or if eye position deviated more than 1° from the center of the screen between the ready cue and the response cue, the trial was aborted and repeated at the end of the experimental block. Observers could blink or move their eyes after the response cue and during the intertrial interval.
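The online fixation criterion described above can be sketched as follows. This is a minimal illustration in Python, not the authors' MATLAB/Eyelink code. The physical screen width (`SCREEN_WIDTH_CM`) is not reported in the text and is a hypothetical value; the function names are also illustrative.

```python
import math

VIEWING_DISTANCE_CM = 57.0   # from the text
SCREEN_WIDTH_PX = 1280       # from the text
SCREEN_HEIGHT_PX = 960       # from the text
FIXATION_LIMIT_DEG = 1.0     # from the text
SCREEN_WIDTH_CM = 40.0       # ASSUMPTION: physical width is not reported


def pixels_per_degree(width_px=SCREEN_WIDTH_PX,
                      width_cm=SCREEN_WIDTH_CM,
                      distance_cm=VIEWING_DISTANCE_CM):
    """Number of pixels spanned by 1 degree of visual angle at screen center."""
    cm_per_degree = 2.0 * distance_cm * math.tan(math.radians(0.5))
    return cm_per_degree * width_px / width_cm


def is_fixation_break(gaze_px, blink=False, ppd=None):
    """True if the sample is a blink or lies more than 1 degree from center."""
    if blink:
        return True
    if ppd is None:
        ppd = pixels_per_degree()
    center_x = SCREEN_WIDTH_PX / 2.0
    center_y = SCREEN_HEIGHT_PX / 2.0
    # Euclidean distance from screen center, converted from pixels to degrees
    distance_deg = math.hypot(gaze_px[0] - center_x, gaze_px[1] - center_y) / ppd
    return distance_deg > FIXATION_LIMIT_DEG
```

At a 57-cm viewing distance, 1 cm on the screen subtends approximately 1°, so the pixel threshold depends mainly on the monitor's pixel pitch.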
Stimuli
Stimuli were displayed on a uniform medium gray background. A fixation circle (subtending 0.15° of visual angle) was presented at the center of the screen. The placeholders were four small black circles (0.2°) placed at the corners of an imaginary square (side length = 2.2°) centered on the screen.
Target stimuli were Gaussian-windowed (standard deviation of 0.3°) sinusoidal gratings (spatial frequency = 4 cpd) with random phase presented at full contrast. Each target was tilted clockwise (CW) or counterclockwise (CCW) from the vertical or horizontal axis. The tilt was titrated for each observer and for each target interval independently.
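As an illustration of the target definition above, the following Python sketch evaluates a Gaussian-windowed sinusoidal grating at a given position in degrees of visual angle. It is not the authors' Psychophysics Toolbox code. The function name, the sign convention for tilt, and the example tilt value are assumptions made for the example.

```python
import math
import random


def gabor_luminance(x_deg, y_deg, tilt_deg, phase_rad,
                    sf_cpd=4.0, sigma_deg=0.3, contrast=1.0):
    """Luminance modulation in [-1, 1] at (x, y) degrees from the patch center.

    tilt_deg is the rotation of the grating away from vertical bars.
    ASSUMPTION: which sign means clockwise depends on the screen's y-axis
    direction, which the text does not specify.
    """
    theta = math.radians(tilt_deg)
    # Position along the axis on which luminance varies (perpendicular to bars)
    u = x_deg * math.cos(theta) + y_deg * math.sin(theta)
    carrier = math.sin(2.0 * math.pi * sf_cpd * u + phase_rad)
    envelope = math.exp(-(x_deg ** 2 + y_deg ** 2) / (2.0 * sigma_deg ** 2))
    return contrast * carrier * envelope


# Example: one sample of a full-contrast target with a random phase.
# The 2-degree tilt is a hypothetical value; the text titrates it per observer.
random_phase = random.uniform(0.0, 2.0 * math.pi)
sample = gabor_luminance(0.1, 0.0, tilt_deg=2.0, phase_rad=random_phase)
```

Mapping the returned modulation onto monitor luminance around the gray background would depend on the display calibration, which is outside the scope of this sketch.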
Auditory stimuli were presented through speakers. The attentional precue was a 200-ms auditory tone, either a sinusoidal wave or a complex waveform composed of sinusoidal waves with frequencies ranging from 50 to 400 Hz. A high-frequency sinusoidal tone (800 Hz) indicated the first target, a low-frequency sinusoidal tone (440 Hz) indicated the second target, and the complex tone was uninformative regarding the target (neutral trials). The same tones, except for the neutral tone, were used as the response cue at the end of the trial to indicate the first target (T1) or the second target (T2).
Experimental Procedure
The experimental protocol was adapted from previous endogenous temporal attention literature (Denison et al., 2017, 2021; Fernández et al., 2019) and expectation literature (Todorovic et al., 2015). In a two-alternative forced-choice task, observers were asked to report the orientation of one of two Gabor stimuli. Throughout the experiment, the two targets were presented sequentially at the center of the screen while observers fixated that location.
Figure 2.B illustrates the experimental protocol. At the beginning of each trial, an auditory precue indicated whether to attend to the first target (T1), the second target (T2), or both targets. Each target was presented for 30 ms, with a stimulus onset asynchrony (SOA) of 250 ms between the two targets. The onset of the targets relative to the precue varied across trials, and its distribution was controlled within each session to manipulate temporal precision. A 200-ms auditory response cue was presented 500 ms after the second target onset. Observers were allowed to respond after the go cue, a change in fixation brightness that occurred 800 ms after response cue onset. In neutral trials, the response cue indicated the first or the second Gabor with 50% probability; in all other trials, the precue was 100% valid and the response cue always matched the precue. Feedback was presented for 500 ms after each trial (a red minus sign after an incorrect response, or a green plus sign after a correct response). Observers had no time limit to respond and were instructed to prioritize accuracy over speed. The intertrial interval was jittered between 1100 and 1700 ms.
Eye tracking was used throughout the experiment, except during the practice session. Participants were instructed to avoid making eye movements and to fixate the center of the screen. If a participant looked away from the center of the screen or blinked during a trial, the trial was skipped and presented again at the end of the block. However, participants were free to make eye movements and blink between the response cue and the precue of the subsequent trial, a period defined as their response window.
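The within-trial timing described above can be summarized as a small worked example. The Python sketch below computes event times relative to precue onset from the intervals stated in the text; the function and key names are illustrative and not taken from the original experiment code.

```python
SOA_MS = 250                  # T1 onset to T2 onset (from the text)
TARGET_DURATION_MS = 30       # duration of each target (from the text)
RESPONSE_CUE_DELAY_MS = 500   # T2 onset to response cue onset (from the text)
GO_CUE_DELAY_MS = 800         # response cue onset to go cue (from the text)


def trial_timeline(t1_onset_ms):
    """Event times in ms relative to precue onset, given the T1 onset."""
    t2_onset = t1_onset_ms + SOA_MS
    response_cue_onset = t2_onset + RESPONSE_CUE_DELAY_MS
    go_cue_onset = response_cue_onset + GO_CUE_DELAY_MS
    return {
        "t1_onset": t1_onset_ms,
        "t1_offset": t1_onset_ms + TARGET_DURATION_MS,
        "t2_onset": t2_onset,
        "t2_offset": t2_onset + TARGET_DURATION_MS,
        "response_cue_onset": response_cue_onset,
        "go_cue_onset": go_cue_onset,
    }


# A trial in which T1 appears at the expected moment (1400 ms after the precue)
expected_trial = trial_timeline(1400)
```

For a T1 onset of 1400 ms, this places T2 at 1650 ms, the response cue at 2150 ms, and the go cue at 2950 ms after precue onset.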
Temporal precision within an experimental session was manipulated by changing the variability of the stimulus onsets relative to the precue, an approach adapted from a previous neuroimaging study investigating the interaction between temporal expectation and temporal attention (Todorovic et al., 2015; Todorovic & Auksztulewicz, 2021). Figure 2.A illustrates the temporal distributions that created uncertainty, that is, different levels of temporal precision. There were four temporal precision conditions in our design: certain, narrow, wide, and uniform. We tested visual performance at the same levels of stimulus probability as in the literature (42% and 86%) and added two temporal precision levels (33% and 100%). In all conditions, the probability distributions of T1 and T2 onsets were centered at 1400 and 1650 ms after precue onset, respectively (the expected moments). The probability of T1 appearing at 1400 ms and T2 at 1650 ms after precue onset was 100% (certain), 86% (narrow), 42% (wide), or 33% (uniform, lowest precision), which determined the level of temporal precision. We collected 1, 2, 3, and 4 sessions for the certain, narrow, wide, and uniform conditions, respectively, to obtain enough data points at the expected moment of each temporal distribution. The temporal distributions were explained to the observers prior to each experimental session.
Except in the 100% certain condition, the stimuli could occur within a time window (1200–1600 ms for T1 and 1450–1850 ms for T2), which enabled us to test performance at times earlier and later than the expected moment (the midpoint of the temporal distributions: 1400 ms for T1 and 1650 ms for T2). In the wide (42%) and uniform (33%) conditions, the number of trials at the early and late time points was comparable to the number of trials at the expected moment.
The order of the trials was randomized across sessions, and the order of the temporal precision sessions was randomized across participants. Each observer completed 7–10 sessions and 3568–4256 trials in total. Before each session, we titrated neutral performance at the expected time points (1400 ms for T1 and 1650 ms for T2 after the precue) to 75% accuracy for each target independently. Each experimental session started with the tilt threshold determined by the Best PEST procedure (Lieberman & Pentland, 1982), and the tilt was adjusted on a block-by-block basis if neutral accuracy differed significantly from the target baseline of 75%.
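The precision manipulation can be illustrated with a short sampling sketch in Python. It rests on a strong assumption that should be checked against Figure 2.A: that T1 could occur at only three discrete time points (1200, 1400, and 1600 ms), with the probability not assigned to the expected moment split equally between the early and late points. The text reports only the window and the probability at the expected moment, so the shape of the distribution used here is hypothetical, as are all names in the code.

```python
import random

# Probability of the expected moment in each condition (from the text)
P_EXPECTED = {"certain": 1.00, "narrow": 0.86, "wide": 0.42, "uniform": 0.33}

# ASSUMPTION: three discrete T1 onsets (early, expected, late), in ms
T1_ONSETS_MS = (1200, 1400, 1600)
SOA_MS = 250  # T2 always follows T1 by the SOA (from the text)


def onset_probabilities(condition):
    """Probabilities of the (early, expected, late) onsets for a condition."""
    p_expected = P_EXPECTED[condition]
    # ASSUMPTION: remaining probability is split equally between early and late
    p_side = (1.0 - p_expected) / 2.0
    return (p_side, p_expected, p_side)


def sample_t1_onset(condition, rng=random):
    """Draw one T1 onset (ms after precue onset) for the given condition."""
    weights = onset_probabilities(condition)
    return rng.choices(T1_ONSETS_MS, weights=weights, k=1)[0]


def sample_trial_onsets(condition, rng=random):
    """Draw (T1 onset, T2 onset) in ms after precue onset."""
    t1_onset = sample_t1_onset(condition, rng)
    return t1_onset, t1_onset + SOA_MS
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible, for example `sample_trial_onsets("wide", random.Random(1))`.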
Statistical analysis
Data analyses were performed with R (version 4.2.3; R Core Team, 2023), with ANOVAs conducted using the ezANOVA function from the ez package (Lawrence, 2016).
The discriminability index, d′, was computed as z(hit rate) – z(false alarm rate). Correct discrimination on clockwise trials was categorized as a hit, and incorrect discrimination on counterclockwise trials was categorized as a false alarm (Fernández & Carrasco, 2020; Zhang et al., 2019). To avoid infinite values when computing d′, we applied a correction in which 0.5 was added to the number of hits, misses, correct rejections, and false alarms before computing d′ (Brown & White, 2005; Hautus, 1995).
We used median values for RTs (measured from the go cue) throughout the analysis. Because participants had unlimited time to respond, the RT distributions were skewed, which makes the median a more robust summary of RT than the mean.
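The d′ computation with the 0.5 correction can be sketched as follows. This is a Python illustration of the formula stated above, not the authors' R code; the function name and argument names are chosen for the example.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false alarm rate), adding 0.5 to every cell count.

    Adding 0.5 to each of the four counts keeps both rates strictly between
    0 and 1, so the z-transform never returns an infinite value.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    false_alarm_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Perfect performance stays finite because of the correction
perfect = d_prime(hits=50, misses=0, false_alarms=0, correct_rejections=50)
```

Without the correction, a hit rate of 1 or a false alarm rate of 0 would map to an infinite z-score.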
To capture the effects on accuracy and the speed of responses with a single metric, we also calculated the Balanced Integration Scores (BIS):
$$\mathrm{BIS}_{i,j} = Z_{\mathrm{ACC}_{i,j}} - Z_{\mathrm{RT}_{i,j}}$$
Here, $\mathrm{BIS}_{i,j}$ represents the difference between the standardized mean accuracy and the standardized median RT, affording equal importance to each. The subscripts $i$ and $j$ refer to the performance of participant $i$ in condition $j$ (Liesefeld & Janczyk, 2019).
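A minimal Python sketch of the BIS computation is given below. It assumes that accuracy and median RT are each standardized across all participant-by-condition cells and that the sample standard deviation is used. Both choices are assumptions for illustration, because the text does not spell out these details, and the data values in the example are invented.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize values using their mean and sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # ASSUMPTION: sample (n - 1) standard deviation
    return [(v - m) / s for v in values]


def balanced_integration_scores(accuracy, median_rt):
    """Return BIS per cell from dicts keyed by (participant, condition)."""
    keys = sorted(accuracy)
    # ASSUMPTION: standardization pools all participant-by-condition cells
    z_acc = zscores([accuracy[k] for k in keys])
    z_rt = zscores([median_rt[k] for k in keys])
    return {k: za - zr for k, za, zr in zip(keys, z_acc, z_rt)}


# Hypothetical data: proportion correct and median RT (s) for two observers
example_accuracy = {("s1", "valid"): 0.85, ("s1", "neutral"): 0.75,
                    ("s2", "valid"): 0.80, ("s2", "neutral"): 0.70}
example_rt = {("s1", "valid"): 0.40, ("s1", "neutral"): 0.50,
              ("s2", "valid"): 0.45, ("s2", "neutral"): 0.55}
example_bis = balanced_integration_scores(example_accuracy, example_rt)
```

Because both z-scored variables have a mean of zero, the scores also average to zero across cells; higher values indicate performance that is both more accurate and faster.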
To provide a better summary of the attentional modulations across precision and expectation, we computed a benefit index using the BIS scores (adapted from Denison et al., 2017):
$$\text{Benefit Index} = (\text{Valid}_{T1} - \text{Neutral}_{T1}) + (\text{Valid}_{T2} - \text{Neutral}_{T2})$$
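As a worked example of the benefit index, the Python sketch below sums the valid-minus-neutral differences for the two targets. The function name and the input values are hypothetical and serve only to show the arithmetic.

```python
def benefit_index(valid_t1, neutral_t1, valid_t2, neutral_t2):
    """Sum of the valid-minus-neutral BIS differences for T1 and T2."""
    return (valid_t1 - neutral_t1) + (valid_t2 - neutral_t2)


# Hypothetical BIS values for one observer in one precision condition
example_benefit = benefit_index(valid_t1=0.6, neutral_t1=0.1,
                                valid_t2=0.4, neutral_t2=-0.2)
```

With these inputs, the T1 difference is 0.5 and the T2 difference is 0.6, so the index is approximately 1.1; a positive index indicates better performance with a valid precue than with a neutral one.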