Participants
We used G*Power 3.1.9.2 (Faul et al., 2009) to estimate the total sample size needed to detect a reliable effect. Based on the partial eta squared values reported in previous reward-CPS studies (Cui et al., 2021; Cristofori et al., 2018), we calculated effect sizes of f = 0.47 and f = 0.52. We therefore adopted the slightly smaller effect size of f = 0.4, the benchmark for a large effect suggested by Cohen (2013). With this effect size, 20 participants were needed to detect a significant effect (α = 0.05, power (1 − β) = 0.9; ANOVA: repeated measures, 2 × 2 within factors).
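The effect sizes above follow from the standard relation between partial eta squared and Cohen's f, f = sqrt(ηp2 / (1 − ηp2)). The following Python lines are a minimal sketch of that conversion; the function names are illustrative, and the input of 0.20 is a placeholder rather than a value reported in the cited studies.

```python
import math


def cohens_f_from_partial_eta_sq(eta_p2: float) -> float:
    """Convert partial eta squared to Cohen's f."""
    if not 0.0 <= eta_p2 < 1.0:
        raise ValueError("partial eta squared must lie in [0, 1)")
    return math.sqrt(eta_p2 / (1.0 - eta_p2))


def partial_eta_sq_from_cohens_f(f: float) -> float:
    """Inverse conversion: Cohen's f back to partial eta squared."""
    return f ** 2 / (1.0 + f ** 2)


# Placeholder input, not a value taken from the cited studies.
example_f = cohens_f_from_partial_eta_sq(0.20)  # sqrt(0.20 / 0.80)

# Partial eta squared implied by the adopted effect size of f = 0.4.
adopted_eta_p2 = partial_eta_sq_from_cohens_f(0.4)
```

The sample size itself was obtained in G*Power; this sketch covers only the effect size conversion that feeds into it.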
Twenty-five participants aged 18 to 23 years (M = 20.74, SD = 1.51; 22 women) participated in this study. All participants had normal or corrected-to-normal vision, were unaware of the study's aims, and were right-handed native Chinese speakers with no reported neurological disorders. The study received approval from the local ethics committee. Upon completion, participants were thoroughly informed about the study's objectives and procedures. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. Participants received 50¥ as payment for their participation, plus an additional performance-dependent monetary reward of up to 70¥.
Design and Procedure
The stimuli were presented on a CRT monitor with a resolution of 1920×1080 and a refresh rate of 85 Hz. The experiment was programmed using E-Prime 3.0 (Psychology Software Tools, Inc., Pittsburgh, PA, USA) for both stimulus presentation and response recording. The study employed a within-participants design with a 2 (reward type: real or hypothetical) × 2 (reward level: high or low) factorial structure. Participants were provided with detailed instructions regarding the reward conditions, the Chinese CRA task, and the concept of insight. They then completed five practice trials to familiarize themselves with the task.
As illustrated in Figure 1, each trial began with a central fixation cross displayed for 0.5 seconds, followed by the presentation of a reward value (1¥ or 0.1¥, each presented in half of the trials) for 1 second. Participants were subsequently required to solve Chinese CRA problems (Cui et al., 2021; Du et al., 2017). Each problem in the Chinese CRA task consisted of three stimulus words (e.g., “xing/liu/li”, 行, 流, 里) presented together. Participants had to generate a solution word (e.g., “cheng”, 程) that could combine with each of the three given words to form a familiar two-word phrase (i.e., “xing cheng”, 行程, “liu cheng”, 流程, “li cheng”, 里程) within a 10-second time limit. They then indicated within 3 seconds whether they had reached their solution through insight or analysis (Jung-Beeman et al., 2004; Salvi et al., 2015). Successful problem solving was followed by reward feedback. In the hypothetical reward condition, participants were informed that the gains were virtual and were instructed to imagine them as real money and to maximize their virtual profits. In the real reward condition, participants were informed that the rewards earned during the experiment were real and that their compensation would be directly proportional to the monetary gains they accumulated throughout the study. The experiment consisted of two blocks, one with real rewards and one with hypothetical rewards. Within each block, all items were pseudorandomly assigned to the two reward levels. The entire experiment comprised 240 trials. Trials were presented in random order, and participants could rest after completing 60 trials. The overall procedure is shown in Figure 1.
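The trial structure described above can be sketched in Python as follows. This is an illustration under several assumptions that the text does not state explicitly: the 240 trials are split evenly into two 120-trial blocks with 60 trials per reward level, the real-reward block is listed first (block order and counterbalancing are not reported), a rest is offered after every 60 trials, and the integer item IDs are hypothetical stand-ins for the CRA problems. The actual experiment was programmed in E-Prime 3.0.

```python
import random

REWARD_TYPES = ("real", "hypothetical")  # one block per reward type
REWARD_LEVELS = ("high", "low")          # 1 yuan vs. 0.1 yuan
TRIALS_PER_BLOCK = 120                   # assumption: 240 trials split evenly
REST_EVERY = 60                          # assumption: rest after every 60 trials

# Event durations in seconds; the last two are upper limits on response time.
TIMELINE = (
    ("fixation", 0.5),
    ("reward_cue", 1.0),
    ("problem", 10.0),
    ("insight_or_analysis", 3.0),
)


def build_trials(seed=0):
    """Return a list of trial dicts with items randomly assigned to conditions."""
    rng = random.Random(seed)
    n_total = TRIALS_PER_BLOCK * len(REWARD_TYPES)
    item_ids = list(range(n_total))  # hypothetical item IDs
    rng.shuffle(item_ids)

    trials = []
    for block_index, reward_type in enumerate(REWARD_TYPES):
        start = block_index * TRIALS_PER_BLOCK
        items = item_ids[start:start + TRIALS_PER_BLOCK]
        half = TRIALS_PER_BLOCK // 2
        levels = [REWARD_LEVELS[0]] * half + [REWARD_LEVELS[1]] * half
        rng.shuffle(levels)  # random assignment of items to reward levels
        for item, level in zip(items, levels):
            trials.append(
                {"item": item, "reward_type": reward_type, "reward_level": level}
            )

    # Mark the trials after which a rest is offered (not after the final trial).
    for index, trial in enumerate(trials):
        done = index + 1
        trial["rest_after"] = done % REST_EVERY == 0 and done < n_total
    return trials


trials = build_trials(seed=1)
```

Under these assumptions each of the four reward conditions contains 60 trials, and three rest opportunities fall within the session.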
EEG Recording and Analysis
EEG data were collected during the Chinese CRA task using the Neuroscan SynAmps2 EEG recording and analysis system. The EEG was recorded from 64 Ag/AgCl electrodes mounted in an elastic cap according to the international 10–20 system. Vertical and horizontal electrooculograms (EOGs) were also recorded during data acquisition. The reference electrode of the Neuroscan cap served as the online reference. EEG data were sampled at 1000 Hz per channel, electrode impedances were kept below 10 kΩ, and the recording bandwidth was 0.05 to 100 Hz.
Offline analyses were performed in MATLAB using the EEGLAB toolbox (Delorme & Makeig, 2004) and the ERPLAB toolbox (Lopez-Calderon & Luck, 2014). The EEG signals were re-referenced to the average of the bilateral mastoid electrodes and filtered with IIR Butterworth filters: a high-pass filter with a half-power cutoff at 0.1 Hz and a low-pass filter with a half-power cutoff at 30 Hz (roll-off = 12 dB/oct for both; Luck, 2014). Independent component analysis (ICA) was performed to correct components associated with eye-movement and eye-blink artifacts. This artifact correction was supplemented with artifact rejection to eliminate trials with clearly artifactual voltage deflections. Specifically, trials were excluded if the peak-to-peak voltage within the EEG epoch exceeded 300 μV in any 200 ms window in any channel (Bacigalupo & Luck, 2018). Four participants for whom more than 35% of trials were rejected because of EEG/EOG artifacts were excluded; therefore, 21 participants were included in the ERP/EEG analysis. For the problem-solving phase, the EEG data were segmented into 1400 ms epochs, ranging from 200 ms before to 1200 ms after stimulus onset. For each participant, epochs belonging to the same reward condition were averaged, yielding four average waveforms time-locked to stimulus onset. These single-participant waveforms were then averaged across participants to obtain group-level (grand average) waveforms for each reward condition. Based on previous EEG studies of creativity (Cui et al., 2023; Qiu et al., 2008) and the grand average waveforms, we focused on the P200-600 component and the late positive component (LPC).
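The rejection rule and the participant-level exclusion criterion can be illustrated with a short Python sketch. It is not the ERPLAB implementation used in the study. The function names are hypothetical, and the 100 ms window step is an assumption, because only the 200 ms window width and the 300 μV threshold are reported.

```python
def max_peak_to_peak(epoch, window, step):
    """Largest peak-to-peak amplitude found in any window of any channel.

    epoch: list of channels, each a list of voltages in microvolts.
    window, step: window width and step size in samples.
    """
    worst = 0.0
    for channel in epoch:
        last_start = max(len(channel) - window, 0)
        starts = list(range(0, last_start + 1, step))
        if starts[-1] != last_start:
            starts.append(last_start)  # make sure the end of the epoch is covered
        for start in starts:
            segment = channel[start:start + window]
            worst = max(worst, max(segment) - min(segment))
    return worst


def is_artifact(epoch, srate_hz=1000, window_ms=200, step_ms=100,
                threshold_uv=300.0):
    """Flag an epoch whose peak-to-peak voltage exceeds the threshold.

    step_ms is an assumed value; it is not reported in the text.
    """
    window = int(round(window_ms * srate_hz / 1000))
    step = max(1, int(round(step_ms * srate_hz / 1000)))
    return max_peak_to_peak(epoch, window, step) > threshold_uv


def exclude_participant(rejected_flags, max_fraction=0.35):
    """True if more than max_fraction of a participant's trials were rejected."""
    return sum(rejected_flags) / len(rejected_flags) > max_fraction


# Synthetic single-channel epochs of 1400 samples (1400 ms at 1000 Hz).
flat_epoch = [[0.0] * 1400]
step_epoch = [[0.0] * 700 + [350.0] * 700]              # abrupt 350 uV jump
drift_epoch = [[280.0 * i / 1399 for i in range(1400)]]  # slow 280 uV drift
```

In this sketch the abrupt jump is flagged, whereas the slow drift is not, because its voltage range within any single 200 ms window stays far below the threshold.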
To increase statistical power and reduce spurious effects (Luck & Gaspelin, 2017), electrodes were collapsed into five regions by averaging their values: frontal (F3, Fz, F4), frontocentral (FC3, FCz, FC4), central (C3, Cz, C4), centroparietal (CP3, CPz, CP4), and parietal (P3, Pz, P4). Three-factor repeated measures ANOVAs were conducted with 2 (reward type: real, hypothetical) × 2 (reward level: high, low) × 5 (region: frontal, frontocentral, central, centroparietal, parietal) factors.
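The collapsing of electrodes into regions amounts to a simple average within each group of three electrodes. A minimal Python sketch follows; the function name and the input format (one mean amplitude per electrode for a single participant and condition) are assumptions made for illustration.

```python
from statistics import mean

# Electrode groups as described in the text.
REGIONS = {
    "frontal": ("F3", "Fz", "F4"),
    "frontocentral": ("FC3", "FCz", "FC4"),
    "central": ("C3", "Cz", "C4"),
    "centroparietal": ("CP3", "CPz", "CP4"),
    "parietal": ("P3", "Pz", "P4"),
}


def collapse_regions(amplitudes):
    """Average electrode amplitudes (microvolts) within each region.

    amplitudes: dict mapping electrode label to the mean amplitude of one
    component for one participant and one condition.
    """
    return {
        region: mean(amplitudes[electrode] for electrode in electrodes)
        for region, electrodes in REGIONS.items()
    }
```

Each participant then contributes one value per region and reward condition to the 2 × 2 × 5 ANOVA.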
Behavioral measures and ERP components were analyzed with JASP 0.16.1 (Wagenmakers et al., 2018). In all analyses, the Greenhouse–Geisser correction was applied to the p values of the F tests to account for deviations from sphericity. Effect sizes for the ANOVAs are reported as partial eta squared (ηp2). For paired t tests, we report Cohen's d, computed with the mean difference score as the numerator and the pooled standard deviation of the two repeated measures as the denominator (Cohen, 2013). The Holm correction (Holm, 1979) was applied to account for multiple comparisons.
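The effect size for paired comparisons and the Holm procedure can be sketched in Python as follows. The analyses themselves were run in JASP, so this is only an illustration. The pooling formula, the square root of the mean of the two variances, is an assumption about how the pooled standard deviation was computed, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def cohens_d_pooled(x, y):
    """Mean difference score divided by the pooled SD of the two measures.

    Assumes pooled SD = sqrt((SD_x^2 + SD_y^2) / 2).
    """
    if len(x) != len(y):
        raise ValueError("paired measures must have the same length")
    differences = [a - b for a, b in zip(x, y)]
    pooled_sd = math.sqrt((stdev(x) ** 2 + stdev(y) ** 2) / 2.0)
    return mean(differences) / pooled_sd


def holm_adjust(p_values):
    """Holm step-down adjusted p values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order):
        # The smallest p value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[index])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[index] = running_max
    return adjusted
```

A comparison is declared significant when its adjusted p value falls below the chosen α (0.05 in this study).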