Participants
Twenty-five autistic adults and 23 non-autistic adults participated in this study; all were right-handed and had no speech, hearing, or neurological difficulties. The groups were comparable for sex (χ2(1) = .060, p = .807), age (t(46) = −.134, p = .894), and verbal (t(46) = .720, p = .475), performance (t(46) = −.875, p = .386), and full-scale IQ (t(46) = .031, p = .975), as measured by four subtests of the Wechsler Adult Intelligence Scale (WAIS-IV UK; [31]; Matrix Reasoning, Block Design, Similarities, Vocabulary). The groups differed on the Autism Spectrum Quotient (AQ; [32]), t(46) = 11.879, p < .001. Although two autistic males were unable to complete the scan owing to discomfort and health concerns, the scan groups remained comparable for sex (χ2(1) = .000, p = 1.000), age (t(44) = −.258, p = .798), and verbal (t(44) = .392, p = .697), performance (t(44) = −1.123, p = .268), and full-scale IQ (t(44) = −.303, p = .763), and differed on AQ, t(44) = −12.325, p < .001. See Table 1.
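For readers who want to reproduce this kind of group-matching check, the following is a minimal sketch (not the authors' analysis code) of the two test types used above, written in Python with SciPy; all arrays contain placeholder values, not participant data.

```python
# Illustrative sketch of group-matching checks with SciPy (placeholder data).
import numpy as np
from scipy import stats

# Hypothetical data: ages for each group and a 2x2 group-by-sex count table.
age_autistic = np.array([25, 31, 28, 42, 23])        # placeholder values
age_nonautistic = np.array([27, 29, 33, 22, 30])     # placeholder values
sex_counts = np.array([[15, 10],                      # autistic (male, female)
                       [13, 10]])                     # non-autistic (male, female)

# Independent-samples t-test (two-tailed), as used for age and the IQ measures.
t, p = stats.ttest_ind(age_autistic, age_nonautistic)
print(f"age: t = {t:.3f}, p = {p:.3f}")

# Chi-square test of independence for the sex distribution across groups.
chi2, p_chi, dof, _ = stats.chi2_contingency(sex_counts, correction=False)
print(f"sex: chi2({dof}) = {chi2:.3f}, p = {p_chi:.3f}")
```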
The autistic participants had received an official diagnosis from a qualified clinician. Due to testing restrictions during the COVID-19 pandemic, we were unable to administer the ADOS [33] to confirm their diagnosis. Nonetheless, the AQ was used in pre-screening, and only non-autistic participants with an AQ score below the cut-off of 32 were included in the study. Non-autistic participants were recruited from the Autism@ICN and UCL SONA subject databases. Autistic participants were recruited through advertisements placed at the Cambridge University Autism Research Centre (ARC) and various university disability services throughout the United Kingdom, in addition to the aforementioned subject databases. Informed written consent was obtained prior to testing, and the project received approval from the UCL research ethics committee.
Table 1
Participant Demographic Information
| | NA | Autism | Autism for scan |
| N (male: female) | 23 (13:10) | 25 (15:10) | 23 (13:10) |
| Age (years) | 28.348 (9.203) | 28.720 (9.969) | 29.087 (10.238) |
| Verbal IQ | 121.435 (15.048) | 117.800 (19.416) | 119.435 (19.315) |
| Performance IQ | 109.478 (15.985) | 113.560 (16.289) | 114.783 (16.059) |
| Full Scale IQ | 119.000 (15.895) | 118.840 (19.429) | 120.565 (19.023) |
| AQ | 13.652 (5.928) | 35.560 (7.292) | 37.348 (7.062) |
Note. Values are given as mean (standard deviation). NA = non-autistic; AQ = autism-spectrum quotient.
Experimental Design and Procedure
Word Stimuli. A subset of 300 words was selected from the original pool [34]. The words were chosen to avoid floor and ceiling effects, based on the results of the baseline ratings task (Funniness: Mean = 3.309, SD = .586; Intelligibility: Mean = 92.959%, SD = 10.619%), and were recorded by a professional male comedian in a comedy performance (Duration: Mean = .736 seconds, SD = .225 seconds; Root-mean-square: Mean = .031, SD = .000; Pitch: Mean = 189.677 Hz, SD = 82.154 Hz). Full details of word selection are given below.
To avoid a floor effect, we initially selected 719 words with a humour rating higher than 2.8 on a 5-point Likert scale (1 - humourless; 5 - humorous) from the original pool [34]. Subsequently, a list of 621 words was audio-recorded by a professional male comedian in a comedy performance and then annotated by four native English speakers to ensure appropriateness. The raw audio was downsampled to 44,100 Hz and saved as mono .wav files with 32-bit resolution, and each word was trimmed and edited into a 1-second sound file (.wav) using Audacity(R) recording and editing software (version 2.3.3). The files were then normalised for root-mean-square (RMS) amplitude using PRAAT [35]. We further conducted an online task to establish baseline funniness ratings for the words. In this task, the 621 words were assigned to three lists of 207 words each. The three lists were matched on the humour ratings from the original pool (List 1: M = 3.19, SD = 1.23; List 2: M = 3.20, SD = 1.23; List 3: M = 3.18, SD = 1.23). Fifty-eight native English speakers were randomly assigned to one of the three lists (List 1: n = 18; List 2: n = 19; List 3: n = 21). Participants were instructed to listen to recordings of a comedian named Ben performing funny words and were asked to rate the funniness of each word on a 7-point rating scale (“How funny was the word the way that Ben said it?” 1 - Not funny at all, 4 - Neutral, 7 - Extremely funny). Additionally, participants indicated whether they understood the meaning of each word. A practice trial was given before the actual task. The task was built and presented using Gorilla Experiment Builder [36].
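The RMS normalisation of the word recordings was performed in PRAAT [35]; the sketch below only illustrates the underlying operation in Python, assuming the soundfile library and hypothetical file names.

```python
# Conceptual sketch of RMS normalisation (the authors used PRAAT [35]).
import numpy as np
import soundfile as sf

def rms_normalise(in_path, out_path, target_rms=0.031):
    """Scale a mono waveform so its root-mean-square amplitude equals target_rms."""
    audio, sample_rate = sf.read(in_path)          # audio as float array in [-1, 1]
    current_rms = np.sqrt(np.mean(audio ** 2))     # RMS of the original signal
    scaled = audio * (target_rms / current_rms)    # rescale to the target RMS
    scaled = np.clip(scaled, -1.0, 1.0)            # guard against clipping
    sf.write(out_path, scaled, sample_rate)

# Example with hypothetical file names:
# rms_normalise("word_raw.wav", "word_norm.wav", target_rms=0.031)
```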
Sound Stimuli. This study used 150 sound stimuli: 50 spontaneous laughter stimuli, 50 conversational laughter stimuli, and 50 noise-vocoded (NV) human vocalisations. The spontaneous and conversational laughter stimuli were recorded using a method previously validated in behavioural and neuroimaging experiments [16, 24] and were selected from a previous study, as detailed in [29]. We created the NV stimuli by applying noise-vocoding to a range of human emotional vocalisations, including expressions of anger, pleasure, disgust, and surprise. The resulting NV stimuli lack emotional meaning and are not perceived as contagious by normal-hearing listeners. We normalised all stimuli for root-mean-square (RMS) amplitude using PRAAT [35] and extracted their acoustic parameters. A one-way ANOVA indicated that the spontaneous laughter, conversational laughter, and NV sound stimuli were comparable in duration, RMS amplitude, and intensity (Table 2). For a detailed comparison of the acoustic properties of spontaneous and conversational laughter, see Table S1.
Table 2
Acoustic properties of sound stimuli set
| Acoustic measure | Conditions | Mean | SD | F | p |
| Total duration (sec) | Spont | 2.376 | .406 | 1.297 | .276 |
| | Conver | 2.269 | .361 | | |
| | NV | 2.307 | .216 | | |
| Root-mean-square | Spont | .317 | .000 | .908 | .406 |
| | Conver | .317 | .000 | | |
| | NV | .317 | .000 | | |
| Intensity (dB) | Spont | 64.000 | .000 | .945 | .391 |
| | Conver | 64.000 | .000 | | |
| | NV | 64.000 | .000 | | |
Note. df = (2, 147). p values are two-tailed. Spont = spontaneous laughter. Conver = conversational laughter. NV = noise-vocoded human vocalisation.
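The comparability check summarised in Table 2 is a standard one-way ANOVA across the three stimulus conditions. A minimal illustrative sketch in Python with SciPy is given below; the values are randomly generated placeholders, not the actual stimulus measurements.

```python
# Illustrative sketch of the acoustic comparability check in Table 2 (placeholder data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical duration values (seconds) for each of the three conditions.
dur_spont = rng.normal(2.376, 0.406, 50)    # spontaneous laughter
dur_conver = rng.normal(2.269, 0.361, 50)   # conversational laughter
dur_nv = rng.normal(2.307, 0.216, 50)       # noise-vocoded vocalisations

# One-way ANOVA: with 3 groups of 50 stimuli, df = (2, 147) as in Table 2.
f_stat, p_value = stats.f_oneway(dur_spont, dur_conver, dur_nv)
print(f"duration: F(2, 147) = {f_stat:.3f}, p = {p_value:.3f}")
```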
fMRI Experimental Design. An event-related paradigm was used, with each trial beginning with a jittered inter-trial interval (ITI) of 2 to 4 seconds. In the sound-stimulus conditions, a funny word was presented, followed after a fixed inter-stimulus interval (ISI) of 0.09 seconds by a sound stimulus from one of three conditions (Spont Laugh, Conver Laugh, NV). The rest condition consisted of a 2-second period of silence following the ITI. Vigilance trials presented a 0.5-second beep, to which participants had to respond with a button press within 3 seconds. Each functional run, approximately 14 minutes long, comprised 25 trials per condition and five vigilance trials to assess attentiveness. The entire experiment consisted of four functional runs of 105 trials each, during which participants passively listened to 300 words paired with sound stimuli; each sound stimulus was used twice. Trial order was pseudorandomised so that no more than two consecutive word-plus-sound trials came from the same condition. Furthermore, rest and vigilance trials were never the first trial of a run and were never presented consecutively. The word-plus-sound pairings were pseudorandomised and counterbalanced across runs and participants (see Fig. 1A).
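The pseudorandomisation constraints above (no more than two consecutive word-plus-sound trials of the same condition; no rest or vigilance trial first or in immediate succession) can be satisfied in several ways. The sketch below, which is not the authors' MATLAB/Psychtoolbox code, shows one simple constructive approach for a single 105-trial run.

```python
# Minimal sketch: build one 105-trial run satisfying the constraints described above.
import random

SOUND = ["spont", "conver", "nv"]          # word-plus-sound conditions
SPECIAL = {"rest", "vigilance"}            # silent and catch trials

def allowed(seq, cond):
    """Check whether cond may follow the partial sequence seq."""
    if cond in SPECIAL:
        if not seq:                        # rest/vigilance never first
            return False
        if seq[-1] in SPECIAL:             # never two rest/vigilance trials in a row
            return False
    elif len(seq) >= 2 and seq[-1] == seq[-2] == cond:
        return False                       # no three identical sound conditions in a row
    return True

def make_run(seed=None):
    rng = random.Random(seed)
    while True:                            # restart on the rare dead end
        counts = {"spont": 25, "conver": 25, "nv": 25, "rest": 25, "vigilance": 5}
        seq = []
        for _ in range(105):
            options = [c for c, n in counts.items() if n > 0 and allowed(seq, c)]
            if not options:
                break
            # weight by remaining count so no condition is exhausted too early
            cond = rng.choices(options, weights=[counts[o] for o in options])[0]
            seq.append(cond)
            counts[cond] -= 1
        if len(seq) == 105:
            return seq

run = make_run(seed=1)
print(len(run), run[:10])
```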
Behavioural Experimental Design. Participants listened to the 200 word-plus-laughter pairs again and rated the funniness of each word. Due to time constraints imposed by COVID-19 testing restrictions, the 100 word-plus-NV pairs were excluded. The pairs were the same as those presented in the scanner, but their order was shuffled. Participants rated each word on a 7-point rating scale (‘How funny was the word the way that Ben said it?’, 1 – Not funny at all, 4 - Neutral, 7 - Extremely funny) and had up to 6 seconds to respond on each trial. A short practice session preceded the main task so that participants could become familiar with its structure. The post-scan behavioural task lasted approximately 25 minutes (see Fig. 1B).
Procedure. Participants were informed that the fMRI study was about humour processing; any mention of laughter was intentionally avoided during recruitment and testing. Before the scan, participants were instructed that they would listen to humorous words spoken by a comedian, followed by people's reactions to them. They were told to press a button on a button-box whenever they heard a "beep" sound. A practice sequence at the start ensured that the volume was adequate and that participants could hear the stimuli clearly, which was verified by asking them to recall the words they had heard. Testing lasted approximately 2 hours, split between a one-hour scan at the Birkbeck-UCL Centre for Neuroimaging (BUCNI) and a one-hour behavioural session at the UCL Institute of Cognitive Neuroscience, which encompassed the post-scan behavioural study, WAIS testing, and questionnaires. Both the fMRI and behavioural experiments were presented using MATLAB [37] with the Psychophysics Toolbox [38].
Neuroimaging, Pre-processing and Analysis
Acquisition. We employed continuous event-related fMRI, acquiring blood-oxygen-level-dependent (BOLD) images using a Siemens Avanto 1.5-Tesla MRI scanner with a 32-channel head coil. The study involved four runs of 260 echo-planar whole-brain volumes (TR = 3 s; TE = 50 ms; TA = 86 ms; slice tilt = 25 ± 5 degrees; flip angle = 90°; 3 mm × 3 mm × 3 mm resolution). Auditory stimuli were delivered via MR-compatible insert earphones connected to a Sony STR-DH510 digital AV control centre. After two functional runs, we obtained high-resolution anatomical images using a T1-weighted magnetisation-prepared rapid acquisition gradient echo sequence (MPRAGE; 176 sagittal slices; TR = 2730 ms; TE = 3.57 ms; flip angle = 7°; acquisition matrix = 224 × 256 × 176; slice thickness = 1 mm; 1 mm × 1 mm × 1 mm voxels).
Pre-processing. The first three volumes of each EPI sequence were discarded. The remaining volumes were spatially aligned along the AC-PC axis for each participant, followed by slice-time correction using the last slice as the reference. The corrected images were then spatially realigned and registered to their mean. The structural image was co-registered with the mean of the corrected functional images, and structural scans were aligned with the SPM12 [39] tissue probability maps during segmentation. The forward deformation field from the segmentation was used to normalise the functional images to standard MNI space. Finally, the normalised functional images were resampled into 2 × 2 × 2 mm voxels and spatially smoothed using an isotropic 8 mm full-width-at-half-maximum (FWHM) Gaussian kernel.
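Smoothing was carried out in SPM12 [39]; as an illustration of this final step only, the sketch below shows the FWHM-to-sigma conversion for an 8 mm kernel applied to a normalised image assumed to be 4D with 2 mm isotropic voxels (hypothetical file names, using nibabel and SciPy).

```python
# Sketch of the smoothing step: an 8 mm FWHM isotropic Gaussian kernel.
import numpy as np
import nibabel as nib
from scipy.ndimage import gaussian_filter

FWHM_MM = 8.0
VOXEL_MM = 2.0
# A Gaussian's FWHM relates to its standard deviation by FWHM = sigma * 2*sqrt(2*ln 2).
sigma_vox = (FWHM_MM / VOXEL_MM) / (2.0 * np.sqrt(2.0 * np.log(2.0)))

img = nib.load("wr_func.nii")                      # hypothetical normalised 4D image
data = img.get_fdata()
smoothed = np.empty_like(data)
for t in range(data.shape[-1]):                    # smooth each volume in space only
    smoothed[..., t] = gaussian_filter(data[..., t], sigma=sigma_vox)
nib.save(nib.Nifti1Image(smoothed, img.affine, img.header), "swr_func.nii")
```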
Analysis. fMRI data were analysed in an event-related manner. Variance in each time series was decomposed in a voxelwise general linear model with the following regressors: the onsets and durations of 1) words plus spontaneous laughter (Spont Laugh), 2) words plus conversational laughter (Conver Laugh), 3) words plus NV stimuli (NV), and 4) vigilance trials. These regressors, together with six regressors representing the realignment parameters calculated by SPM12 [39], constituted the full model for each session. The data were high-pass filtered with a cutoff of 128 seconds.
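The analysis itself was run in SPM12 [39]; the following simplified Python sketch is included only to make the structure of such a voxelwise GLM concrete: condition onsets and durations are convolved with a canonical double-gamma HRF, combined with six motion regressors, and fitted by ordinary least squares. All onsets, durations, and time series are placeholders, and the 128-second high-pass filter is omitted.

```python
# Simplified GLM sketch: HRF-convolved event regressor + motion regressors + OLS fit.
import numpy as np
from scipy.stats import gamma

TR = 3.0                                  # seconds, as in the acquisition
n_scans = 257                             # volumes per run after discarding the first 3

def canonical_hrf(step, length=32.0):
    """Double-gamma HRF (SPM-like parameters) sampled every `step` seconds."""
    t = np.arange(0, length, step)
    hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0
    return hrf / hrf.sum()

def make_regressor(onsets, durations, tr, n_scans, dt=0.1):
    """Boxcar at fine resolution, convolved with the HRF, resampled to scan times."""
    fine_len = int(n_scans * tr / dt)
    boxcar = np.zeros(fine_len)
    for onset, dur in zip(onsets, durations):
        boxcar[int(onset / dt):int((onset + dur) / dt)] = 1.0
    conv = np.convolve(boxcar, canonical_hrf(dt))[:fine_len]
    scan_idx = (np.arange(n_scans) * tr / dt).astype(int)
    return conv[scan_idx]

# Hypothetical onsets/durations (s) for one condition, plus placeholder motion parameters.
spont_reg = make_regressor(onsets=[10.0, 45.5, 80.0], durations=[3.1, 3.0, 3.2],
                           tr=TR, n_scans=n_scans)
motion = np.random.randn(n_scans, 6)              # placeholder realignment parameters
X = np.column_stack([spont_reg, motion, np.ones(n_scans)])   # design matrix + intercept
y = np.random.randn(n_scans)                      # placeholder voxel time series
betas, *_ = np.linalg.lstsq(X, y, rcond=None)     # OLS parameter estimates
```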
Contrast images were computed per participant [All Laughs (Spont Laugh & Conver Laugh) > NV, Spont Laugh > NV, Conver Laugh > NV] from individual design matrices modelling the three experimental conditions across the four runs, with movement parameters included as nuisance variables. These contrasts were entered into second-level, two-sample t-tests for the group analysis. Whole-brain results were corrected for multiple comparisons using a cluster-extent-based thresholding approach [40]: a voxel-wise threshold of p < .001 combined with a cluster-extent threshold determined by SPM12 [39] (p < .05 family-wise-error (FWE) cluster-corrected). All reported clusters exceeded this cluster-corrected threshold. Cluster coordinates are reported in the Montreal Neurological Institute (MNI) coordinate system and were labelled using the AAL atlas in SPM12 [39].
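The cluster-extent correction was performed within SPM12 [39]; the schematic sketch below conveys the logic only: voxels surviving an uncorrected p < .001 threshold are grouped into connected clusters, and clusters smaller than the FWE-derived extent threshold k are discarded. The statistic map and the value of k are placeholders.

```python
# Schematic sketch of cluster-extent thresholding (placeholder statistic map and k).
import numpy as np
from scipy import ndimage
from scipy.stats import t as t_dist

df = 44                                     # e.g. error df for a two-sample t-test
t_thresh = t_dist.isf(0.001, df)            # voxel-wise threshold, p < .001
k_extent = 120                              # hypothetical FWE-based extent threshold (voxels)

t_map = np.random.randn(91, 109, 91)        # placeholder statistic map on a 2 mm MNI grid
supra = t_map > t_thresh                    # suprathreshold voxels
labels, n_clusters = ndimage.label(supra)   # connected-component clusters
sizes = ndimage.sum(supra, labels, index=np.arange(1, n_clusters + 1))
surviving = np.isin(labels, np.flatnonzero(sizes >= k_extent) + 1)
print(f"{int((sizes >= k_extent).sum())} clusters survive the extent threshold")
```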
Software. Pre-processing and statistical analyses were conducted in SPM12 [39], implemented in MATLAB R2018B [37]. The MarsBaR toolbox [41] was used to create regions of interest (ROIs): spherical ROIs with an 8 mm radius were built around the peak voxels of selected contrasts, and beta values were extracted from them for analysis.
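MarsBaR [41] handled the ROI construction and beta extraction; the sketch below illustrates the equivalent operation in Python with nibabel, assuming a hypothetical peak coordinate and beta image.

```python
# Minimal sketch: 8 mm-radius spherical ROI around a peak MNI coordinate, mean beta inside.
import numpy as np
import nibabel as nib

peak_mni = np.array([54, -8, 44])                  # hypothetical peak (x, y, z) in mm
radius_mm = 8.0

beta_img = nib.load("beta_0001.nii")               # hypothetical first-level beta image
beta = beta_img.get_fdata()

# Compute the MNI (mm) coordinate of every voxel centre from the image affine.
i, j, k = np.meshgrid(*[np.arange(d) for d in beta.shape], indexing="ij")
voxels = np.stack([i, j, k, np.ones_like(i)], axis=-1)
mm = voxels @ beta_img.affine.T                    # voxel indices -> mm coordinates

# Spherical mask: voxels whose centre lies within radius_mm of the peak.
dist = np.linalg.norm(mm[..., :3] - peak_mni, axis=-1)
roi = dist <= radius_mm

mean_beta = np.nanmean(beta[roi])                  # average beta within the ROI
print(f"mean beta in 8 mm sphere at {peak_mni}: {mean_beta:.3f}")
```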