Thirty right-handed healthy subjects (ten females), aged 18 to 35 years (mean age = 27 years), participated in the study. The study was conducted at the Central Institute of Psychiatry, a tertiary care psychiatric hospital in eastern India. Subjects were recruited by word of mouth among the hospital staff, after excluding any neurological or psychiatric illness and hearing impairment. The study was approved by the institute ethics committee of the Central Institute of Psychiatry, and all subjects signed an informed consent form before enrolment.
Procedure
The study participants engaged in a talk-listen paradigm, with stimulus presentation controlled by E-Prime 3 software (Psychology Software Tools, Inc.). During the talk condition, participants were instructed to emit brief (<300 ms), sharp vocalizations of the phoneme 'Ah' at a self-paced interval of approximately 2-3 seconds, for a duration of 180 seconds. Training ensured that participants maintained the 'Ah' vocalizations at a consistent sound intensity of 75 to 85 dB, as measured by a handheld decibel meter (model H-M80A) positioned approximately 6 cm from the participant's mouth. This sound intensity was standardized across the talk and listen conditions by calibrating the earphone audio output with the same handheld decibel meter, to mitigate variability. Participants were also cautioned against any overt facial or bodily movements that could introduce artifacts into the EEG recordings.
The vocalizations were captured using a microphone (model NT1) connected to the computer managing the stimulus presentation and were played back to the subjects in real time via Audio-Technica earphones equipped with foam insert tips and active noise cancellation (model ATH-ANC33iS), which were inserted into the ear canals. Following the talk condition, the listen condition commenced, wherein the previously recorded vocalizations were played back passively to the subjects. Throughout both conditions, participants were instructed to maintain their focus on a centrally positioned white cross displayed against a dark background on a 14-inch monitor.
Offline processing of the audio track:
The recorded vocalizations were digitized and processed offline using an automated MATLAB-based script designed to detect the onset of vocalizations (Ford et al., 2010). The onset timestamps identified by this automated routine were then verified using Audacity®, an open-source audio editor. Erroneous timestamps were corrected or removed, yielding the final event file. This file was then integrated into the EEG recording for subsequent time-locked epoching and averaging of EEG data.
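The original onset-detection script is not reproduced here; a minimal sketch of the general approach (envelope thresholding with a refractory period) is shown below. The threshold fraction, smoothing window, and minimum inter-onset gap are illustrative assumptions, not the parameters of the Ford et al. (2010) routine.

```python
import numpy as np

def detect_vocal_onsets(audio, fs, threshold=0.1, min_gap_s=1.0):
    """Return sample indices where the smoothed amplitude envelope
    first crosses a threshold.

    `threshold` (fraction of the envelope peak) and `min_gap_s`
    (refractory period between onsets) are illustrative values only.
    """
    env = np.abs(audio)
    # Smooth with a ~10 ms moving average to suppress brief transients
    win = max(1, int(0.010 * fs))
    env = np.convolve(env, np.ones(win) / win, mode="same")
    above = env > threshold * env.max()
    onsets, last = [], -np.inf
    # Rising edges of the thresholded envelope
    for i in np.flatnonzero(above[1:] & ~above[:-1]) + 1:
        if (i - last) / fs >= min_gap_s:
            onsets.append(i)
            last = i
    return np.asarray(onsets)
```

Timestamps produced this way would still require the manual verification step described above before being written to the event file.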
Data Acquisition and Processing:
Electroencephalogram (EEG) data were acquired using a 64-channel cap of Ag-AgCl electrodes placed according to the international 10-10 system and interfaced with a Galileo EB Neuro system. Impedance was kept below 5 kΩ. Additional bipolar electrodes were positioned at the outer canthi of both eyes to record the horizontal electrooculogram (HEOG) and above and below the right eye to record the vertical electrooculogram (VEOG), allowing eye movements to be monitored. The EEG data were recorded at a sampling rate of 1024 Hz with an online reference located between the Fz and AFz electrodes. The data were imported offline into EEGLAB (Delorme & Makeig, 2004), and a 1-15 Hz band-pass Butterworth filter of order 2 was applied before further processing. The data were then decomposed into statistically independent components using an extended infomax-based independent component analysis. Artefactual components were identified using the Multiple Artefact Rejection Algorithm (MARA) plugin, freely available in EEGLAB (Winkler et al., 2011), and subsequently removed. Each dataset was then re-referenced offline to LM (the average of mastoid electrodes TP9 and TP10), to CAR (the average of all channels), and to the REST reference, creating three distinct data processing streams; REST referencing was performed using the freely available EEGLAB plugin (Dong et al., 2017). Each dataset was further segmented into 600 ms epochs with a 100 ms pre-stimulus interval, and baseline correction was applied using the 100 ms pre-stimulus baseline in all epochs. All epochs were screened for amplitude fluctuations, and those exceeding ±50 µV were rejected.
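The filtering, epoching, baseline-correction, and rejection steps above can be sketched as follows. This is an illustrative numpy/scipy reimplementation, not the EEGLAB pipeline actually used; the ICA/MARA cleaning and re-referencing stages are omitted, and the `preprocess` helper name is ours.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 1024  # Hz, sampling rate as in the recording

def preprocess(eeg, onsets, fs=FS):
    """Band-pass filter, epoch, baseline-correct, and reject epochs.

    `eeg` is (n_channels, n_samples) in microvolts; `onsets` are sample
    indices of vocalization onsets. Parameters mirror the text: 1-15 Hz
    order-2 Butterworth, 600 ms epochs with a 100 ms pre-stimulus
    baseline, and a +/-50 uV rejection criterion.
    """
    b, a = butter(2, [1 / (fs / 2), 15 / (fs / 2)], btype="band")
    eeg = filtfilt(b, a, eeg, axis=-1)          # zero-phase filtering
    pre, post = int(0.100 * fs), int(0.500 * fs)  # 600 ms total
    epochs = []
    for on in onsets:
        if on - pre < 0 or on + post > eeg.shape[-1]:
            continue                             # epoch falls off the record
        ep = eeg[:, on - pre:on + post].copy()
        ep -= ep[:, :pre].mean(axis=-1, keepdims=True)  # baseline correction
        if np.abs(ep).max() <= 50.0:             # +/-50 uV screening
            epochs.append(ep)
    return np.stack(epochs)  # (n_epochs, n_channels, n_samples)
```

Zero-phase filtering (`filtfilt`) is used here so that the filter does not shift the N1 latency; EEGLAB's default FIR filtering behaves similarly in this respect.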
Data Analysis
ERP Analysis:
The N1 peak amplitude was determined by identifying the most negative amplitude relative to the pre-stimulus baseline within a temporal window of 60-150 ms following the vocalization of 'Ah' in both the talk and listen conditions. The N1 amplitudes were assessed separately for data referenced to LM, CAR, and REST. The grand-average amplitude scalp topography showed the predominant negativity centred on the frontocentral area across all reference schemes (as illustrated in Figure 2); consequently, our analysis concentrated on the amplitude data from this region. The Cz and FCz electrode sites have been consistently utilized as target sites for sampling data in prior investigations examining CD in various neuropsychiatric conditions. For statistical analysis, the N1 Talk and N1 Listen values were computed as the average N1 amplitude across the Cz and FCz channels for each respective condition. Furthermore, the degree of N1 suppression, quantified by the corollary discharge index (CDI), was determined as the difference between the N1 amplitudes in the talk and listen conditions [CDI = N1Talk − N1Listen]. The CDI values used for statistical comparisons were derived by averaging the CDIs calculated from the Cz and FCz channels.
Statistical analysis:
The primary objective of this investigation was to evaluate the impact of three distinct reference schemes on event-related potential (ERP) data, focusing on identifying the most effective reference scheme for detecting speech-generated corollary discharge during the talking condition. A two-way repeated-measures analysis of variance (ANOVA) was employed to assess the influence of the study condition (Talk and Listen) and the reference scheme (LM, REST, and CAR) on the auditory N1 amplitude. The threshold for rejecting the null hypothesis was set at p < 0.05. Subsequently, targeted analyses involved separate paired t-tests between conditions for each offline reference scheme, to determine whether differences in N1 amplitude persisted irrespective of the reference scheme utilized. We conducted a separate repeated-measures ANOVA to compare the CDI values across the reference schemes, to understand the impact of the scheme on this frequently employed measure of corollary discharge-induced N1 suppression.
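The per-scheme follow-up comparisons can be illustrated with a paired t-test, as sketched below using scipy. This is a sketch of the test, not the SPSS analysis actually run; the `talk_vs_listen` helper name is ours.

```python
from scipy.stats import ttest_rel

def talk_vs_listen(n1_talk, n1_listen, alpha=0.05):
    """Paired t-test of N1 Talk vs. N1 Listen for one reference scheme.

    `n1_talk` and `n1_listen` are per-subject N1 amplitudes (same
    subject order in both). Returns the t statistic, the p value, and
    whether the null hypothesis is rejected at `alpha`.
    """
    t, p = ttest_rel(n1_talk, n1_listen)
    return t, p, p < alpha
```

Running this once per reference scheme (LM, REST, CAR) reproduces the logic of the targeted analyses; the omnibus two-way repeated-measures ANOVA itself was performed in SPSS.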
Further analyses included a correlational analysis to explore the impact of the selected offline reference scheme on the CDI metrics: CDI-LM, CDI-REST, and CDI-CAR. All statistical analyses were performed using IBM SPSS Statistics 20 (SPSS Inc., Chicago, Illinois).
SPSM and Scalp topography:
To obtain a more comprehensive appreciation of the distribution of scalp potential differences across conditions, we further conducted a topographical analysis of scalp potentials by implementing statistical parametric scalp mapping (SPSM), a statistical technique for elucidating the scalp distribution of significant differences between the condition-specific N1 ERPs. To this end, we performed paired t-tests on the N1 amplitude between the Talk and Listen conditions across all scalp electrodes and represented the significant t-values as a scalp topographic map. Results were considered significant at p < 0.01 after FDR correction for multiple comparisons. This analysis was performed separately for each reference scheme: CAR, REST, and LM.
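The electrode-wise testing with FDR control can be sketched as below, using the Benjamini-Hochberg step-up procedure (a standard FDR method; the text does not specify which FDR variant was used, so this is an assumption). The `spsm_map` name and the NaN masking convention are ours.

```python
import numpy as np
from scipy.stats import ttest_rel

def spsm_map(talk, listen, q=0.01):
    """Paired t-test per electrode with Benjamini-Hochberg FDR control.

    `talk` and `listen` are (n_subjects, n_electrodes) N1 amplitudes.
    Returns the electrode t-values with non-significant electrodes
    masked to NaN, ready for plotting as a scalp topographic map.
    """
    t, p = ttest_rel(talk, listen, axis=0)
    m = p.size
    order = np.argsort(p)
    # BH step-up: largest k with p_(k) <= q * k / m
    thresh = q * np.arange(1, m + 1) / m
    passed = p[order] <= thresh
    k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
    sig = np.zeros(m, dtype=bool)
    sig[order[:k]] = True
    return np.where(sig, t, np.nan)
```

With 64 electrodes and q = 0.01, the smallest p-value must fall below 0.01/64 ≈ 0.00016 for any electrode to survive, which is why the SPSM maps only retain robust frontocentral effects.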
Source Analysis:
To further validate the SPSM results, we conducted three sequential analysis steps using the Brainstorm application (Tadel et al., 2019). Firstly, a 3-shell spherical head model (brain, skull, and scalp conductivities of 0.33, 0.0042, and 0.33, respectively) was estimated for each participant, with a total of 15,002 sources (voxels) distributed over the cerebral cortex in the ICBM 152 head model (a non-linear average of 152 magnetic resonance scans), using the default channel location file in Brainstorm (Fonov et al., 2011). The regularize-noise-covariance parameter was set to 0.1, and the signal-to-noise ratio parameter to 3.00. Secondly, using these parameters, source activity was reconstructed for the grand-averaged ERPs, estimated with the LM scheme, for the Talk and Listen conditions. Subsequently, a 20 ms time range around the N1 peak in the source space was extracted and averaged for each condition separately. Finally, the averaged source activity differences were projected back to the sensor space to generate the topographic map. The resultant topographic map reflects the source difference activity between the conditions. The source reconstruction and sensor projection were performed by Brainstorm in the MATLAB environment.
Noise covariance:
The noise covariance required for the source reconstruction was estimated from resting-state EEG data. A 150-second eyes-open resting-state EEG was recorded for each individual after completion of the task for the computation of the noise covariance matrix. Each dataset was processed in the same manner, including the 1-15 Hz band-pass filter. Subsequently, all filtered datasets were concatenated into a single dataset to compute the covariance matrix.
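The concatenate-then-estimate step amounts to a channel-by-channel sample covariance over the pooled resting data, as sketched below. This is an illustrative numpy equivalent of what Brainstorm computes internally, not the Brainstorm code itself; the `noise_covariance` helper name is ours.

```python
import numpy as np

def noise_covariance(segments):
    """Channel-by-channel noise covariance from resting EEG.

    `segments` is a list of (n_channels, n_samples) filtered
    resting-state recordings (one per subject). They are concatenated
    in time, each channel's mean is removed, and the unbiased sample
    covariance across time is returned, mirroring the
    concatenate-then-estimate step described in the text.
    """
    data = np.concatenate(segments, axis=1)
    data = data - data.mean(axis=1, keepdims=True)
    return data @ data.T / (data.shape[1] - 1)
```

Filtering the resting data with the same 1-15 Hz band-pass as the task data keeps the noise model matched to the spectral content of the ERPs being localized.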