There are numerous situations in which people, sometimes unknowingly, produce voluntary actions. These actions may be driven by long-range intentions (for example, studying hard from the beginning of a semester in preparation for final exams) and/or by short-range intentions, that is, immediate urges to act, such as adjusting one’s sitting position (Bratman, 1987; Mulcahy & Call, 2006; Schacter et al., 2007; Pacherie, 2008). In fact, short-range intentions to act depend strongly on the extent to which long-range intentions have been formed (Haggard, 2008; Pacherie & Haggard, 2010). The relation between long-range intentions and voluntary action has been a pivotal focus of research on ‘conscious intention’ in neuroscience. Conscious intention, in its most fundamental sense, is a change in behaviour brought about by experience. More importantly, both action and conscious intention are driven by common brain activity, namely neural preparation for action, which could explain the correlation between them (Wegner, 2003; Haggard, 2008; Lebreton et al., 2019). Voluntary action has been investigated as one of the major classes of human action in cognitive neuroscience, with accumulating evidence in support of its distinct functional role (Haggard, 2008; 2018; Passingham et al., 2010).
Probably the earliest reported neural marker of voluntary action is the Bereitschaftspotential or “readiness potential” (RP). As described by Kornhuber and Deecke (1965), the RP is a gradual buildup of electrical potential prior to voluntary action. The RP is regarded as “the electro-physiological sign of planning, preparation, and initiation of volitional acts” (Kornhuber & Deecke, 2003), thus inviting the assumption that the link between the gradual increase in electrical potential and the consequent voluntary action reflects a preparatory process of establishing intentions to act. Notably, Libet (1982, 1983) attempted to access the neural basis of intention using a method based on Wundt’s classical technique, demonstrating that neural activity precedes the individual’s intention to move by a few hundred milliseconds. By contrast, Schurger et al. (2012) employed a modified Libet task in which participants were additionally told to press the button immediately whenever they heard a brief click as an interruption. Faster responses to these temporally unpredictable cues (the clicks) were preceded by a gradual negative-going electrical potential over the central area prior to the interruption itself, whereas no similar pattern was observed for slower responses. These results suggest that neural activity during the RP period reflects spontaneous subthreshold fluctuations that trigger action onset when they reach a threshold, rather than an internal signal of volition as in the Libet interpretation (Libet et al., 1983). However, the actions in these studies were mainly driven by short-range intentions within a single trial, meaning that the findings may not generalise to more complex real-world voluntary action (Churchland, 1981; Khalighinejad et al., 2018; Maoz et al., 2015; Nachev & Hacker, 2014).
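The threshold-crossing account described above can be illustrated with a minimal simulation. The sketch below is not the model code of Schurger et al. (2012); it is a generic leaky stochastic accumulator, and the function name `first_crossing` and all parameter values (threshold, drift, leak, noise, time step) are hypothetical choices made only for illustration.

```python
import random


def first_crossing(threshold=0.3, drift=0.1, leak=0.5, noise=0.1,
                   dt=0.001, max_time=60.0, seed=None):
    """Simulate dx = (drift - leak * x) * dt + noise * sqrt(dt) * N(0, 1).

    All default values are illustrative assumptions, not fitted parameters.
    Returns (crossing_time, trace). crossing_time is in seconds, or None if
    the threshold is not reached before max_time.
    """
    rng = random.Random(seed)
    x, t = 0.0, 0.0
    trace = [x]
    sqrt_dt = dt ** 0.5
    while t < max_time:
        # Euler-Maruyama step: deterministic drift and leak plus Gaussian noise.
        x += (drift - leak * x) * dt + noise * sqrt_dt * rng.gauss(0.0, 1.0)
        t += dt
        trace.append(x)
        if x >= threshold:
            # The simulated "action" is triggered at the first threshold crossing.
            return t, trace
    return None, trace


if __name__ == "__main__":
    crossing_time, trace = first_crossing(seed=1)
    print("crossing time (s):", crossing_time, "| samples simulated:", len(trace))
```

With these assumed defaults the noise-free signal settles at drift / leak = 0.2, below the threshold of 0.3, so any crossing in the sketch is produced by the random fluctuations themselves. Averaging many such traces time-locked to the crossing would be one way to examine whether a slow buildup emerges without any dedicated decision signal.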
Recent studies have adopted paradigms with more specific intentional goals to better resemble voluntary action. Khalighinejad et al. (2018; 2019) adapted a ‘wait-or-skip’ approach previously employed with non-human animals (Murakami et al., 2014), in which participants could choose either to wait or to ‘skip’ in order to obtain larger or smaller rewards. The skip action was regarded as a good example of real-world intentional action, because it involves both short-range intentions (the ‘urge’ of being about to act) and relatively long-range intentions concerning the expected reward over the whole experiment (i.e., the consequence of a series of actions). Inter-trial EEG variability decreased more markedly prior to voluntary skip actions than in a control condition in which the skip action was produced in response to an external cue. These results point to the reduction in neural noise as a promising neural signature of voluntary action. Moreover, “a consistent process of volition” was detectable during both the early planning stage and the late execution-related stage of action (Khalighinejad et al., 2018; 2019).
A more recent study developed a reinforcement learning paradigm in which participants learned through trial and error the optimal time to bake a soufflé (Travers et al., 2021). The voluntary action in this study was the key press that withdrew the soufflé from the oven when it was ready, and participants were rewarded for the correct timing of this action. This study broadened research on voluntary action by engaging a short-range intention (i.e., estimating the baking time) and a long-range intention (i.e., each action builds on previous actions and aims at the final reward) at the same time. The results suggested that the RP reflects motor planning for upcoming behaviour, rather than freedom from external constraints.
Although these studies included long-range intentions as a factor, they were not designed to address them directly; they therefore lacked the control conditions needed to identify the neural signatures specifically associated with long-range intentions. Furthermore, these paradigms retain traditional stereotyped features, such as highly repeatable trials and reliance on a single decision point rather than a continuous readout of action, which weaken their ecological validity as models of real-world voluntary action (Miller et al., 2022). To directly address the issue of long-range intentions, we adopted a paradigm derived from studies of schedules of reinforcement (see Ferster & Skinner, 1957). A free-operant reinforcement schedule enables participants to perform more elaborate voluntary actions without any direct trial-based cues.
The current paradigm was adapted from a recently revised version that uses random ratio (RR) and yoked random interval (RI) reinforcement schedules to form intention-based action (Bradshaw & Reed, 2012; Chen et al., 2020; 2021). The RI schedule is presented after the RR schedule, and reinforcement in the RI block is yoked to the rewards obtained in the RR block (see Methods for details). Participants learned to respond at a series of optimal time points in the RI block to obtain the highest number of points, based on their experience in the preceding RR block. Once participants were familiar with the RR-yoked-RI schedule, the actions produced during the RI blocks were voluntary actions without any direct single-action-based cues; that is, they were time-based. Time-based actions have been suggested to be more endogenous than the event-based actions used in previous approaches, which largely reduce endogenous action to externally triggered action (Okuda et al., 2007; Pacherie & Haggard, 2010). Thus, actions in the current task extend beyond a single trial and are associated with long-range intentions through a continuous behavioural process: each response in the RI block builds on the previous response through immediate feedback and is strongly shaped by experience during the RR block and by the current contextual information, rather than reflecting an isolated decision. This procedure allows participants to mentally revisit previous events and apply that information to related current events, a process known as mental time travel, which is regarded as a prime candidate for long-range intentions (Tulving, 1983, 2005; Dudai & Carruthers, 2005; Suddendorf & Corballis, 2007; Pacherie & Haggard, 2010). The reward rates of participants in the RI blocks therefore offer an indicator of the extent to which long-range intentions have been formed.
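The logic of yoking an interval schedule to a preceding ratio schedule can be sketched in a few lines. The example below is only an illustration under stated assumptions and is not the task code (the actual procedure is specified in the Methods): it assumes that each RR response is rewarded with probability 1/ratio, that the intervals between RR rewards are reused in order in the RI block, and that each interval is timed from the previous collected reward. The function names `run_rr_block`, `yoked_intervals`, and `run_ri_block`, and the toy response times, are hypothetical.

```python
import random


def run_rr_block(response_times, ratio, rng):
    """Random ratio: each response is rewarded with probability 1 / ratio.

    Returns the times of the rewarded responses.
    """
    return [t for t in response_times if rng.random() < 1.0 / ratio]


def yoked_intervals(reward_times):
    """Intervals between successive RR rewards (the first is timed from 0)."""
    intervals = []
    previous = 0.0
    for t in reward_times:
        intervals.append(t - previous)
        previous = t
    return intervals


def run_ri_block(response_times, intervals):
    """Yoked interval: a reward becomes available once the current interval
    has elapsed since the last collected reward (an assumption of this sketch);
    the first response after that point collects it.
    """
    rewarded = []
    last_reward = 0.0
    pending = list(intervals)
    for t in response_times:
        if pending and t - last_reward >= pending[0]:
            rewarded.append(t)
            last_reward = t
            pending.pop(0)
    return rewarded


if __name__ == "__main__":
    rng = random.Random(0)
    rr_responses = [0.5 * k for k in range(1, 41)]   # toy data: one response every 0.5 s
    rr_rewards = run_rr_block(rr_responses, ratio=5, rng=rng)
    intervals = yoked_intervals(rr_rewards)
    ri_responses = [1.0 * k for k in range(1, 21)]   # toy data: slower responding
    print("RR rewards:", len(rr_rewards))
    print("RI rewards:", len(run_ri_block(ri_responses, intervals)))
```

In this sketch, responding faster raises the number of rewards in the ratio block but not in the yoked interval block, where a reward is only collected by a response made after the relevant interval has elapsed. This is the sense in which well-timed responding, rather than rapid responding, is favoured in the RI block.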
Furthermore, the reinforcement learning paradigm allows behaviour to be modulated by previous experience and thus affords greater flexibility than instinctual modules of behaviour; this flexibility provides the foundation for reasoning and allows learning in one task to be endogenously transferred to another (Suddendorf & Corballis, 2007; Pacherie & Haggard, 2010).
Based on previous studies of the neural signatures of voluntary action, the present study focused on both the averaged EEG amplitude prior to action onset and the inter-trial variability (Kornhuber & Deecke, 1965; Dilek et al., 2022). If the current paradigm is valid, RI-based action (voluntary action) should involve a greater degree of conscious intention than action in a yoked condition with an external trigger. We would therefore predict a difference between EEG variability preceding voluntary action and variability preceding externally triggered action at a similar time (see Methods), which would indicate the crucial neural signatures of voluntary action. Importantly, any change in EEG amplitude and/or variability with learning in the present task could further elucidate the functional involvement of these signatures, given the nature of schedules of reinforcement (Tolman & Honzik, 1930; Glimcher, 2005; Gershman, 2019). That is, a comparison between RI blocks with higher and lower levels of performance (i.e., higher and lower reward rates) would reveal how these signatures relate to long-range intentions.
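For concreteness, the two measures named above can be computed as follows. This is a minimal sketch under stated assumptions, not the analysis pipeline of the present study: it assumes single-channel epochs that are already time-locked to action onset and of equal length, it takes inter-trial variability to be the across-trial standard deviation at each time point, and the function names and the toy numbers are hypothetical.

```python
from statistics import mean, stdev


def mean_amplitude(epochs):
    """Average across trials at each time point (the RP-like waveform)."""
    n_samples = len(epochs[0])
    return [mean([trial[i] for trial in epochs]) for i in range(n_samples)]


def across_trial_sd(epochs):
    """Standard deviation across trials at each time point."""
    n_samples = len(epochs[0])
    if any(len(trial) != n_samples for trial in epochs):
        raise ValueError("all epochs must have the same number of samples")
    return [stdev([trial[i] for trial in epochs]) for i in range(n_samples)]


def variability_change(sd_trace, early, late):
    """Mean SD in the late window minus mean SD in the early window.

    Windows are (start, stop) sample indices with stop exclusive.
    A negative value means variability decreased as the action approached.
    """
    return mean(sd_trace[late[0]:late[1]]) - mean(sd_trace[early[0]:early[1]])


if __name__ == "__main__":
    # Toy data only: three trials, four samples each, ending at action onset.
    toy_epochs = [
        [1.0, 0.5, -1.0, -2.0],
        [-1.0, -0.5, -1.5, -2.2],
        [0.0, 0.2, -1.2, -1.8],
    ]
    sd_trace = across_trial_sd(toy_epochs)
    print("mean amplitude:", mean_amplitude(toy_epochs))
    print("change in SD (late minus early):", variability_change(sd_trace, (0, 2), (2, 4)))
```

Under these assumptions, the predicted contrast would amount to comparing `variability_change` between epochs preceding voluntary actions and epochs preceding externally triggered actions, and between RI blocks with higher and lower reward rates.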