Participants
Data from forty-eight adult participants (26 females, age range 18-27 years; Supp. Table 1) were included in the analyses for the behavioural experiment; participants were randomly assigned to one of two context conditions (single- and dual-context, each n=24). Data from twenty-two different adult participants (12 females, age range 19-25 years) were included in the analyses for the fMRI experiment; all were assigned the dual-context condition. See Supp. E. for recruitment and eligibility information. All participants provided written informed consent, and all study procedures were approved by the Institutional Review Board at the University of California, Los Angeles (UCLA).
Overview
In the behavioural experiment, participants were randomly assigned to one of the two conditions (single- or dual-context); all participants in the fMRI experiment were assigned to the dual-context condition. All participants underwent the same procedural sequence (Fig. 1): Context A encoding, Language 1 encoding in Context A, Context B encoding, Language 2 encoding in Context A (single-context condition) or Context B (dual-context condition), non-VR test (in laboratory or in MRI scanner), and surprise telephone test.
This experiment measured recall at five time-points (Times 1-5, hence T1-T5). Each language was encoded four times in the VR-based learning contexts: one initial study session followed by three test-study cycles (T1-T3) across two lab visits on consecutive days. At the end of the Day 2 visit, participants were tested outside of the VR learning contexts (T4), either in the lab or in the MRI scanner, and were tested again over the telephone one week later (T5).
Virtual Reality
Two distinctive VR-based contexts were used for the learning task (Fig. 2a-d). “Fairyland Garden” was a fantasy-fiction type context that was bright, verdant, visually open, and expansive. Its landscape was rich with water and trees, the buildings were wooden, every room opened to the outdoors, and birdsong, crickets, and other nature-based ambient sounds played throughout (Fig. 2a). “Moon Base,” on the other hand, was a science-fiction type context in which participants were always confined indoors within the base, whose structure featured metallic walls, narrow hallways, electronic control panels, artificial colours, and mechanical ambient sounds (Fig. 2c). Each context contained nine named areas (hence, “rooms”); the name of each room was displayed in English on signs at its boundaries.
The VR-based contexts displayed different experimental objects during the context encoding phase and language encoding phase. During context encoding, location markers were placed in each room to demarcate the location for participants to “stand” as they encoded the context. During language encoding, interactive 3-D objects representative of the to-be-learnt words were placed on “pedestals” in each room, organised along a hinted floor path that displayed transient markers between pedestals (Fig. 2b, 2d).
An additional VR environment (Fig. 1a.1, 1a.2) was used for participants to learn to control their avatars, receive task instructions, and practice the tasks. This training environment was underwater, in honour of one of the pioneering demonstrations of context-dependent memory.4 It was designed to be visually attractive and highly fantastical (e.g., swimming fish, shifting lights), allowing participants time to adjust to the other-worldly nature of the VR experience. The aim was to let participants focus on the learning tasks without being distracted by the novelty of the VR experience itself.
These VR-based contexts were created for this study using the open-source OpenSimulator platform (v0.8.2.1, Diva Distribution). Firestorm Viewer (v4.4.2-v5.0.7, 2014-2017) rendered the content, presented on a computer running Windows 7 Professional. Participants navigated the world using a mouse and keyboard, and used headphones with a built-in microphone to hear the stimuli and communicate with experimenters. All graphics were displayed on a 27” LED monitor, a high-resolution flatscreen display that participants viewed in close proximity in a darkened room, rather than on a head-mounted display (HMD). Our initial piloting with an HMD (Oculus Rift DK1) found that many participants eventually experienced motion sickness that interfered with their ability to concentrate on the task. Switching to an LED monitor (a set-up often referred to as “desktop VR”) largely ameliorated this issue, although it may have contributed to some of our participants reporting a limited sense of “presence” in the VR worlds.
During the VR tasks, an experimenter was present to monitor the participant’s behaviour and to communicate with the participant over headphones. Although the experimenter and participant were in the same room, they were separated by a cubicle wall such that they were out of each other’s sight.
Word List, Cues, and Testing
Word list. The to-be-learnt word lists were designed to be as similar, and thus as confusable, as possible. A total of 60 English words, and their translations in two phonetically similar Bantu languages—Swahili and Chinyanja—were used in the experiment. Each participant learnt to pronounce a total of 80 foreign words: 10 English words were learnt in Swahili only, 10 in Chinyanja only, and 30 in both languages. The Swahili word list was drawn from Carpenter & Olson (2012),27 and the Chinyanja versions of these words were translated using Google Translate™ and modified (see Appendix I for the word lists and details regarding the modifications).
Audio stimuli for language learning and testing. During language encoding, audio recordings of the foreign words accompanied their written form. These recordings were made by a single speaker who had no formal training in Bantu languages (J.K.-Y.E.). This was an intentional decision to ensure that the foreign words were readily pronounceable by English speakers, as this experiment prioritised the memory aspect of the task over the degree of linguistic authenticity.
As Smith, Glenberg, and Bjork (1978)5 found that experimenters constituted part of the learning context, we took precautions to prevent uncontrolled context reinstatement arising from subject-experimenter interactions. First, a single speaker recorded the audio for both languages used during the learning task, ensuring that speaker identity or voice could not serve as a context cue differentiating the languages. This speaker made every attempt not to speak to participants during the experimental procedures. Second, tests conducted outside of the learning contexts were cued by other speakers: the English audio cues used in T4 were recorded by A.O., and T5 was conducted by a team of research assistants.
Testing software. The short-delay non-VR test (T4; Fig. 4) was presented using PsychoPy2.41,42 The long-delay surprise memory test was administered over the telephone using Google’s Hangouts™ communication platform (audio only); calls were digitally recorded with participants’ permission, and foreign vocabulary recall was cued conversationally by the experimenters.
fMRI Protocol and In-Scanner Verbal Response Recording
fMRI protocol. fMRI data were collected with a Siemens 3.0 Tesla Magnetom Prisma scanner at the UCLA Ahmanson-Lovelace Brain Mapping Center, using a 64-channel head coil. Functional data were acquired using T2*-weighted simultaneous multislice echoplanar imaging (EPI) sequences (TR = 1.0 s; TE = 30 ms; flip angle = 52°; FoV = 20.8 cm; multiband acceleration factor = 5; 65 oblique axial slices; voxel resolution 2 × 2 × 2 mm). Each of the 10 runs consisted of 330 volumes and included eight trials of the task (we did not discard initial volumes, as the version of the Syngo software used did not begin recording until T1 stabilised). Additionally, a T1-weighted structural MRI [axial magnetisation-prepared rapid gradient-echo (MPRAGE), 0.8 mm³] was obtained for spatial registration of the functional data.
Auditory stimuli were presented via OptoActive™ noise-cancelling headphones, which were equipped with the FOMRI III™+ microphone (Fig. 1c) to record participants’ verbal responses during fMRI scans. This system provided online noise cancellation, which enabled high-quality recordings of participants’ vocalisations and allowed participants to hear the audio stimuli clearly despite the scanner noise. No post-experimental denoising of the verbal responses was required. Button responses were recorded via Current Designs fibre optic response pads, an MR-compatible button box device. MR-compatible goggles were used for visual presentation.
Procedure: Day 1 and Day 2, Context and Language Encoding (T1-T3)
Day 1.
Familiarisation, Instructions, and Practice. After informed consent and general instructions, participants “entered” the introductory VR environment. Therein, participants first familiarised themselves with the navigational controls. They then received instructions for the context- and language-encoding tasks by watching a video on a screen within the world (Fig. 1a.1), and practiced the two tasks (Fig. 1a.2) under the supervision of an experimenter, who provided corrective feedback to ensure that participants had a proper understanding of the tasks. Participants practiced the context encoding task (see below) by performing it in the practice context. They then practiced the language encoding task by learning the translations of a set of practice items in the pseudo-language ‘Pig Latin’.
Context A Encoding (Fig. 1a.3). Participants were then “teleported” to Context A (Moon Base or Fairyland Garden, counterbalanced across participants), where they performed a guided encoding task of the VR-based context itself. Each context contained 9 “rooms,” each equipped with a location marker. In each room, participants were instructed to walk to the marker and perform two full clockwise rotations (720°) within 30 s while looking around the room. Participants were instructed to pretend that they were a tourist who had forgotten their camera and to try to remember what it felt like to be in that particular place. As participants entered and exited each room, the experimenter informed them of the rooms’ names (e.g., “You are now leaving Sickbay and entering Airlock.”).
Language 1 Encoding (T1-T2; Fig. 1a.4). There were four rounds of language encoding for each language (three rounds on Day 1, and one on Day 2). Before each round, participants were told which language they would be learning. After Context A encoding and a mandatory 2-min break, participants re-entered Context A for Round 1 of Language 1 encoding (Swahili or Chinyanja, counterbalanced across participants).
In each round, participants navigated along the hinted walking path (Fig. 2b and 2d) and encountered a series of 40 pedestals (with 3-5 pedestals in each room). Upon each pedestal hovered a slowly rotating 3-D object representing the to-be-learnt word (e.g., a rooster), with its English name floating above so that participants could be certain what the object was (i.e., so they knew it was not a hen or a turkey). As Fig. 2e depicts, participants were instructed to walk up to each object, read its English name aloud, and then “touch” it (i.e., click on it). The touch changed the floating English text to reveal the foreign transliteration, and participants then heard the foreign pronunciation three times via headphones, evenly spaced across 10 s. Participants were instructed to repeat after the audio each time by pronouncing the foreign word aloud. Upon completion, they touched the pedestal to reveal a visible path marking the way to the next pedestal and object. The path hints were transient and disappeared after use. Object sequences were controlled so that they were consistent within each language: for a given participant, the same object always appeared in the same location for one language, but always in a different location for the other language. The pedestal locations and navigational route remained consistent across all rounds. A 5-min break was inserted between Rounds 2 and 3.
Retrieval Practice (Fig. 2e.2). Retrieval practice was incorporated into Rounds 2-4. During these rounds, after participants walked up to each object and spoke aloud its English name, they were to first attempt to verbally recall its foreign translation before touching the object. If the participant did not recall the translation and did not wish to attempt a guess, they had the option to say “pass.” They then touched the object, which triggered the transliteration of the foreign word to appear and the audio of its pronunciation to be played. Thus, regardless of whether they were correct, incorrect, or passed, the participant received feedback as to the correct answer. Then, as in Round 1, participants heard and repeated after the audio three times within a 10 s period. Participants’ verbal responses were digitally recorded and used to index their recall during each round, with performance summarised as: T1 (recall during Round 2, before the 2nd encoding), T2 (recall during Round 3, before the 3rd encoding), and T3 (recall after an overnight delay, before the 4th and final encoding). In the rare cases when participants neglected to attempt recall or say “pass” before touching an object, the associated vocabulary words were dropped from analysis after that time point. For example, consider a participant who touched the 3-D boat object during Round 3 before attempting to recall the Swahili word for “boat.” Even though the participant would continue to encounter the boat in Round 4 to maintain consistency across participants, that word would be excluded from analyses of that participant’s T3, T4, and T5 data.
Context B Encoding (Fig. 1a.5). After Round 3 of Language 1 encoding, participants encoded Context B. The procedure was identical to Context A encoding, except that it occurred in the other VR-based context. This was followed by a 5-min break.
Language 2 Encoding (T1-T2; Fig. 1a.6). After the break, participants began Language 2 encoding. This is the only portion of the procedures in which the experiences of the two context groups diverged. Dual-context participants remained in Context B to encode Language 2, while single-context participants were teleported back to Context A to encode Language 2 (note that single-context participants never learnt any language in Context B). The encoding procedure was identical to Language 1 encoding.
Post-VR Questionnaires. Thereafter, participants completed the Virtual Presence Scale (on REDCap), an immersion survey (not used in the analysis),18,43 the Simulator Sickness Questionnaire,44 and the Pittsburgh Sleep Quality Index.45 They were then reminded of their appointment the next day and sent home.
Day 2.
Participants returned the next day around the same time of day to perform Language 1 Encoding Round 4 (T3). Then, following a 2-min break, participants performed Language 2 Encoding Round 4 (T3). Round 4 was participants’ last exposure to the foreign words and VR contexts.
Procedure: Day 2, Short-Delay, Non-VR testing (T4)
Language encoding was followed by a 10-min break (behavioural experiment) or 30-min break (fMRI experiment), after which participants were tested for the first time outside of the VR-based learning contexts (T4), either in the lab (behavioural experiment) or in the MRI scanner (fMRI experiment). During the break, participants in the behavioural experiment were unoccupied for 10 min under supervision, seated in a waiting room without using internet-capable devices. A 30-min interval was scheduled for participants in the fMRI experiment. During this time, each participant was escorted by their experimenter to the Ahmanson-Lovelace Brain Mapping Center (an 8-min walk from the laboratory), underwent final MRI safety screening, and was set up in the MRI scanner.
T4 consisted of 80 trials (one for each foreign word learnt) evenly divided into 10 runs. Each trial (Fig. 4) consisted of the following periods: “Ready” screen, mental reinstatement, language recall, imagery vividness rating, and two trials of an arithmetic task that served as active baseline for fMRI data analysis. T4 procedures were identical in the behavioural and fMRI experiments.
Ready (1 s). A grey screen displaying the words “Get Ready” was presented to mark the beginning of each trial.
Mental Reinstatement (10 s). The mental reinstatement period began with an audio cue, which stated the name of a VR-based context followed by that of a room therein (e.g., “Moon Base: Airlock”). Following the audio cue, the screen turned black; based on instructions provided before the scan, participants knew that this meant they should close their eyes, imagine themselves back in that specific room, and mentally perform the full rotations (as they had practiced the prior day in the VR-based context encoding task) until they heard a beep. Participants used a series of button presses to indicate the progress of their imagined rotation: when they had mentally “placed” themselves on the marker, and when they had rotated 180°, 360°, 540°, and so on. If participants completed a full rotation before the allotted time, they were instructed to continue mentally rotating and button-pushing until the beep. Upon hearing the beep, which sounded 10 s after audio cue offset, participants were to cease the mental rotation task and open their eyes to prepare for the next phase of the trial.
In the congruent reinstatement condition, participants were cued to reinstate the specific room in which they had learnt the word to be recalled later in this trial. In the incongruent condition, they were cued to reinstate a room from the other context (for dual-context participants, this was the context where they had learnt the other language; for single-context participants, this was the context where they had not encoded any language).
Language Recall (8 s). The language recall period began 2 s after the onset of the previous beep. Participants first heard an audio cue, which stated a language and then an English word whose translation they had learnt in the stated language (e.g., “Chinyanja: rooster”). After hearing the cue, participants were to covertly retrieve the English word’s translation in the cued language (i.e., to mentally recall the foreign word without saying it aloud). If they felt they were successful, they were to push Button 1 and continue thinking about the word until they heard a beep. If they failed to retrieve the foreign word, they were to push Button 2 and continue attempting retrieval until the beep; should they succeed at any point after indicating failure, they were to push Button 1 at the moment of successful retrieval. The beep sounded 8 s after cue offset, at which point participants were to pronounce the foreign word aloud, or as much of it as they could remember. These responses were recorded and scored as T4 data. The length of the verbal response recording period varied between 6.5 and 7.0 s depending on the length of the cue (3.0-3.5 s), so that the combined duration of the two always summed to 10 s.
Imagery Vividness Rating (2 s). After verbal recall, participants were asked to rate how vivid the preceding mental reinstatement had been (1 for very vivid, 2 for vivid, 3 for not vivid, and 4 for unsuccessful). These ratings were later used for trial exclusion in the analyses involving mental reinstatement.
Arithmetic Task (5 s). At the end of each trial, participants performed an arithmetic task. Participants saw a display (2.5 s) with two single-digit integers, and they were to push Button 1 if the product of these numbers was odd, and Button 2 if it was even. Then a new pair of digits appeared (2.5 s) and participants performed the same task.
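For concreteness, the following is a minimal PsychoPy-style sketch of a single T4 trial following the timings above. It is illustrative only, not the original task script; the audio file names, tone parameters, digits, and key mappings are placeholder assumptions.

    # Illustrative sketch of one T4 trial (not the original PsychoPy2 task script).
    from psychopy import visual, core, event, sound

    win = visual.Window(fullscr=True, color='black')
    text = visual.TextStim(win, text='Get Ready', color='white')
    beep = sound.Sound(440, secs=0.2)                      # assumed tone parameters
    clock = core.Clock()

    # Ready (1 s)
    text.draw(); win.flip(); core.wait(1.0)

    # Mental reinstatement: context/room audio cue, then 10 s of eyes-closed imagery
    cue = sound.Sound('moonbase_airlock.wav')              # e.g., "Moon Base: Airlock" (placeholder file)
    cue.play(); core.wait(cue.getDuration())
    win.flip()                                             # blank (black) screen
    clock.reset(); rotation_presses = []
    while clock.getTime() < 10.0:                          # Button 1 marks each 180 degrees of imagined rotation
        rotation_presses += event.getKeys(keyList=['1'], timeStamped=clock)
    beep.play()

    # Language recall: cue begins 2 s after the beep, then 8 s of covert retrieval
    core.wait(2.0)
    word_cue = sound.Sound('chinyanja_rooster.wav')        # e.g., "Chinyanja: rooster" (placeholder file)
    word_cue.play(); core.wait(word_cue.getDuration())
    clock.reset(); retrieval_presses = []
    while clock.getTime() < 8.0:                           # Button 1 = retrieved, Button 2 = not (yet) retrieved
        retrieval_presses += event.getKeys(keyList=['1', '2'], timeStamped=clock)
    beep.play(); core.wait(7.0)                            # overt verbal response recorded during this window

    # Imagery vividness rating (2 s): 1 = very vivid ... 4 = unsuccessful
    text.text = 'How vivid was your imagery? (1-4)'
    text.draw(); win.flip()
    rating = event.waitKeys(maxWait=2.0, keyList=['1', '2', '3', '4'])

    # Arithmetic baseline: two 2.5-s odd/even product judgements
    for a, b in [(3, 4), (7, 5)]:                          # digits are placeholders
        text.text = f'{a} x {b}'
        text.draw(); win.flip()
        event.waitKeys(maxWait=2.5, keyList=['1', '2'])    # 1 = odd product, 2 = even product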
Procedure: Day 2, Post-experimental Survey
After T4, participants completed a short survey asking what strategies (if any) they had used to learn and recall the words, and whether there was anything else they would like to communicate to the experimenters.
Procedure: Day 8, One-Week Delay, Surprise Testing (T5)
On Day 8, participants were telephoned for a scheduled “follow-up interview” with the understanding that an experimenter would “ask them about things they had experienced in the VR.” The only instructions they received about the phone call were that they should be at home, seated in a quiet place. Participants were not informed that they would be tested again.
During the call, the experimenter requested permission to record the participant’s responses. After permission was granted, the experimenter asked the following questions: (1) Had they looked up or studied any of the Swahili or Chinyanja words during the preceding week? (2) Had they expected to be tested again? (3) What percentage of the words did they expect to recall? (see Supp. C).
The experimenter then conducted a cued recall test of participants’ memory for all 80 of the foreign words they had learnt. On each trial, the experimenter cued the participant with an English word and the language into which it was to be translated (e.g., “How do you say ‘cherry’ in Swahili?”). The order in which the words were tested was fully randomised, such that testing alternated unpredictably between the two foreign languages. Participants’ vocal responses were recorded and scored as T5 data.
Language Test Scoring
Recall. Digital recordings of the verbal responses from T1-T5 were scored offline by two scorers. The score for each word was the number of correct phonemes divided by the total number of phonemes. Scorers were trained to use a detailed decision tree, and when the two scorers disagreed, the average of the two scores was used as the final recall score for that word. This partial word score provided more fine-grained results than binary (correct vs incorrect) word recall. In this scoring scheme, phonemes in shorter words were weighted more heavily than phonemes in longer words. This weighting mirrors the consequences of phonemic errors in real-world communication: mistakenly producing, for instance, a “P” instead of a “V” in the word “van” tends to be more consequential than the same error in a longer word like “supervisor,” and makes it much more difficult for the listener to guess the intended meaning.
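For illustration, a minimal sketch of the partial-credit computation, assuming the response and target have already been segmented into phonemes; the scorers’ decision tree and alignment rules are not reproduced here, and the example strings are hypothetical.

    def partial_word_score(response_phonemes, target_phonemes):
        """Number of correctly produced target phonemes divided by the total
        number of target phonemes (simplified position-wise comparison)."""
        correct = sum(r == t for r, t in zip(response_phonemes, target_phonemes))
        return correct / len(target_phonemes)

    def final_score(score_a, score_b):
        """When the two scorers disagree, the mean of their scores is used."""
        return (score_a + score_b) / 2

    # e.g., a response missing the final phoneme of a five-phoneme target earns 0.8
    print(partial_word_score(list('jogo'), list('jogoo')))   # 0.8
    print(final_score(0.8, 1.0))                             # 0.9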
Retention Measures. Retention was measured inversely via a forgetting score between two tests. Overnight retention (reported in Supp. A4) was computed based on the difference between T3 and T2. One-week retention was computed based on the difference between T5 and T4.
Forgetting Score. The forgetting score was computed as follows. First, an item-wise forgetting index was computed for each word with a non-zero score on the earlier test (i.e., if no phonemes were recalled at T4, the word was excluded from this computation for one-week forgetting). These forgetting indices measured loss between the two tests: a negative forgetting index meant the word was recalled worse after one week, and a forgetting index of zero meant no forgetting, and thus perfect one-week retention. For example, if a word had a recall score of 1 (full, correct recall) at T4 but only 0.5 (half of the phonemes were missing or incorrect) at T5, it would receive a forgetting index of -0.5, indicating that half of the word had been forgotten. On the other hand, if a word had a score of 1 at both T4 and T5, it would receive a forgetting index of 0, indicating perfect retention. These forgetting indices were then averaged within each participant (across all eligible words) to produce a forgetting score. The forgetting score was thus a metric of forgetting, or the inverse of retention—the more negative the score, the more forgetting and thus the poorer the retention.
Retention Score. For ease of interpretation, a positive retention score was computed by subtracting the magnitude of the averaged forgetting score from 1 (equivalently, adding the negative-valued forgetting score to 1). A retention score of 1 indicates perfect retention across all eligible words, 0.5 indicates that half of the information was retained, and 0 means that no information was retained.
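A minimal sketch of these two computations, under the interpretation given above that the forgetting score is negative-valued (so that the retention score equals 1 plus the forgetting score); the example item scores are hypothetical.

    import numpy as np

    def forgetting_score(earlier, later):
        """Mean item-wise forgetting index (later minus earlier recall score),
        restricted to words with a non-zero score on the earlier test."""
        earlier, later = np.asarray(earlier, float), np.asarray(later, float)
        eligible = earlier > 0
        return (later[eligible] - earlier[eligible]).mean()

    def retention_score(earlier, later):
        """1 = perfect retention, 0.5 = half retained, 0 = nothing retained."""
        return 1 + forgetting_score(earlier, later)

    # e.g., four words scored at T4 and T5 (the word scored 0.0 at T4 is ineligible)
    t4 = [1.0, 0.5, 0.0, 1.0]
    t5 = [0.5, 0.5, 0.0, 1.0]
    print(forgetting_score(t4, t5))   # -0.167 (some forgetting)
    print(retention_score(t4, t5))    # 0.833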
Intrusion Measure. When scoring T4 and T5, scorers were instructed to compare the transliteration of each word to its counterpart in the other language, and to determine from experience whether the word in question was similar to any other words in either language (see Supp. Appendix II for intrusion coding). The scorers were experimenters who had become highly familiar with the words in both languages. In addition to formal training, scorers spent 2-6 hours each week monitoring participants during language encoding, testing participants during T5, or scoring verbal responses offline. Despite this, “similarity” between words remains a somewhat arbitrary, experience-based judgement. Therefore, two precautions were introduced: a newer scorer was always paired with a highly experienced one in the scoring assignments, and the maximum code was used when the scorers disagreed—as higher ratings denote more severe intrusions, and preliminary examination revealed that novice scorers tended to underrate intrusions rather than overrate them.
Behavioural Data Analysis
Statistical analyses were conducted using SPSS 26.0.46 The between-subject factors were Context Group (single- vs. dual-context) and Presence (high- vs. low-presence, a mean-split grouping based on the Virtual Presence Scale19). The within-subject factors were Times (T1-T5), Language Order (Language 1 vs 2; not reported, see Supp. A1), and Reinstatement (congruent vs. incongruent reinstatement). The dependent variables were intrusions (number of items coded as intrusions from the opposite language, out of a total of 80 items), recall (mean item-wise percentage of phonemes correct for a given test), and retention (see Retention Score above).
fMRI Data Analysis
fMRI Pre-processing. Functional data were pre-processed without spatial smoothing, pre-whitening, or B0 unwarping, using the FMRIB Software Library (FSL) 5.0.4 and Advanced Normalisation Tools (ANTS 2.0).47 The FSL Brain Extraction Tool (BET2)48 was used to perform brain extraction. FSL49 FEAT50 was used to apply a high-pass temporal filter (cutoff = 128 s). Timeseries alignment, motion correction, and registration to the standard Montreal Neurological Institute (MNI) template were performed using FMRIB's Linear Image Registration Tool (FLIRT),51–53 Motion Correction FLIRT (MCFLIRT),51 and ANTS.
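As a rough sketch of how these steps chain together, the following Python snippet calls the corresponding FSL and ANTs command-line tools. It is illustrative only, not the study's exact pipeline: file names and registration settings are placeholders, the high-pass filter is applied with fslmaths rather than the FEAT interface, and the sigma conversion assumes a 128-s cutoff at TR = 1 s.

    # Illustrative chaining of the named tools via their command-line interfaces.
    import subprocess

    def run(cmd):
        print(' '.join(cmd))
        subprocess.run(cmd, check=True)

    TR, HP_CUTOFF_S = 1.0, 128.0                                   # assumed high-pass cutoff in seconds

    run(['bet', 'T1w.nii.gz', 'T1w_brain.nii.gz'])                 # brain extraction (BET2)
    run(['mcflirt', '-in', 'func.nii.gz', '-out', 'func_mc'])      # motion correction (MCFLIRT)
    run(['fslmaths', 'func_mc', '-bptf', str(HP_CUTOFF_S / (2 * TR)), '-1',
         'func_mc_hp'])                                            # high-pass temporal filter
    run(['fslmaths', 'func_mc_hp', '-Tmean', 'func_mean'])         # mean functional image for registration
    run(['flirt', '-in', 'func_mean', '-ref', 'T1w_brain.nii.gz',  # functional-to-anatomical registration
         '-out', 'func2anat', '-omat', 'func2anat.mat', '-dof', '6'])
    run(['antsRegistrationSyNQuick.sh', '-d', '3',                 # anatomical-to-MNI registration (ANTs)
         '-f', 'MNI152_T1_2mm_brain.nii.gz', '-m', 'T1w_brain.nii.gz',
         '-o', 'anat2mni_'])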
fMRI Task Timing and Trial Categorisation. The mental reinstatement (Fig. 4 “Imagery”) and language retrieval (Fig. 4 “Language”) periods from each trial were extracted from the dataset. The BOLD timeseries for these periods were extracted using adjusted onset and offset times (5 s, i.e., 5 TRs, was added to the onsets and offsets to account for the lagging hemodynamic response, or HDR). The resulting truncated timeseries was then temporally averaged at each voxel, yielding one averaged Imagery pattern and one averaged Language pattern for each trial.
Imagery. Each “Imagery” period began when participants indicated, via a button push, that they had mentally “placed” themselves in the to-be-reinstated context (Fig. 4 “Orient”), and ended at the onset of the beep (the beep that informed participants to open their eyes and end mental reinstatement). Because the onset for each trial was based on the participant’s responses, the duration of the imagery period varied across trials. Imagery period data were labelled as Moon Base or Fairyland Garden, based on the world that participants were cued to reinstate. Trials were excluded if participants reported that they were “unsuccessful” during the imagery rating portion or did not push buttons to report the progress of their mental reinstatement rotation.
Language. Each “Language” period began with the onset of the audio cue, and ended 6 s afterwards. The duration of this period was task-based, and fixed in length. Language period data were labelled by the foreign word to be recalled (e.g., Chinyanja: Dress).
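A minimal sketch of the HDR-shifted extraction and temporal averaging described above, assuming the pre-processed 4-D data are loaded with nibabel; the onset and offset times in the example are hypothetical.

    # Sketch of trial-pattern extraction: shift by 5 TRs, then average over time.
    import nibabel as nib
    import numpy as np

    TR, HDR_SHIFT = 1.0, 5                                      # 5 s = 5 TRs at TR = 1 s

    bold = nib.load('run01_preprocessed.nii.gz').get_fdata()    # x, y, z, time

    def trial_pattern(onset_s, offset_s):
        """Temporally average the HDR-shifted volumes spanning one trial period."""
        first = int(round(onset_s / TR)) + HDR_SHIFT
        last = int(round(offset_s / TR)) + HDR_SHIFT
        return bold[..., first:last + 1].mean(axis=-1)          # one 3-D pattern per period

    # e.g., an Imagery period from the first rotation button-press to beep onset,
    # and a fixed-length 6-s Language period beginning at audio-cue onset
    imagery_pattern = trial_pattern(onset_s=32.4, offset_s=41.0)
    language_pattern = trial_pattern(onset_s=43.0, offset_s=49.0)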
Searchlight Multi-Voxel Pattern Analysis (SL-MVPA). An SL-MVPA was conducted using the Imagery patterns to identify regions of the brain that expressed multivariate activity patterns capable of discriminating between a participant’s mental reinstatement of Moon Base vs. Fairyland Garden (Supp. Fig. 1). To this end, we employed a support vector machine (SVM) classifier with a linear kernel using libSVM (nu-SVC, c=1)54 and a whole-brain searchlight mapping approach (radius = 4 voxels). Classification was cross-validated using a leave-one-run-out method—the classifier was trained on the valid trials from 9 runs (9 × 8 trials) and tested on the valid trials from the left-out run (8 trials). Trial labels were balanced prior to classification by randomly subsampling the overrepresented trial type to match the number of trials in the underrepresented type. The entire cross-validation procedure was repeated over 10 iterations (one for each run) and the classification results were averaged. This produced a brain map whose voxel values reflected the classifier’s cross-validation accuracy when the searchlight sphere was centred on that voxel (Supp. Fig. 1.4). The 2000 voxels with the highest classification accuracies were identified for each participant and used to create a distributed region of interest for the subsequent representational similarity analysis, serving as within-subject feature selection (Supp. Fig. 1.5).
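The following sketch uses nilearn and scikit-learn as a stand-in for the libSVM-based implementation described above (scikit-learn’s NuSVC wraps libSVM’s nu-SVC). Input file names are placeholders, and the label-balancing subsampling step is omitted for brevity.

    # Hedged sketch of the whole-brain searchlight with leave-one-run-out cross-validation.
    import numpy as np
    import nibabel as nib
    from nilearn.decoding import SearchLight
    from sklearn.model_selection import LeaveOneGroupOut
    from sklearn.svm import NuSVC

    imagery_imgs = nib.load('imagery_patterns_4d.nii.gz')   # one 3-D pattern per valid Imagery trial
    context_labels = np.load('context_labels.npy')          # 'MoonBase' vs 'FairylandGarden'
    run_labels = np.load('run_labels.npy')                  # run of origin, for leave-one-run-out CV
    brain_mask = nib.load('brain_mask.nii.gz')

    searchlight = SearchLight(
        mask_img=brain_mask,
        radius=8,                                  # 4 voxels x 2 mm
        estimator=NuSVC(kernel='linear'),          # nu-SVC with a linear kernel
        cv=LeaveOneGroupOut(),
        n_jobs=-1,
    )
    searchlight.fit(imagery_imgs, context_labels, groups=run_labels)

    # searchlight.scores_ is a 3-D map of cross-validated accuracies; the 2000 most
    # accurate voxels form the participant-specific feature-selection mask for the RSA
    accuracy_map = searchlight.scores_
    threshold = np.sort(accuracy_map[accuracy_map > 0])[-2000]
    roi_mask = accuracy_map >= threshold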
Representational Similarity Analysis (RSA). For each word that each participant had learnt, the RSA produced a value of the similarity between (1) the brain response pattern when the participant was recalling that word, and (2) the averaged brain response pattern when the participant was mentally reinstating that word’s learning context (Fig. 5a).
This within-subject RSA was conducted using custom MATLAB code. First, the trial-specific imagery and language patterns (produced by the aforementioned temporal averaging of the HDR-adjusted timeseries within each trial period) for each participant were masked using that participant’s top 2000 voxels identified in the SL-MVPA. Second, the imagery patterns for each learning context were averaged within-subject to produce participant-specific mental reinstatement templates for Moon Base and Fairyland Garden. Third, the language pattern for each word was then correlated (Pearson) with the reinstatement template of its learning context. For instance, if a participant had learnt “banana” in Chinyanja in Fairyland Garden, the language pattern from the covert retrieval of the Chinyanja word for “banana” would be correlated with the Fairyland Garden template—an average of all imagery patterns recorded during the mental reinstatement of Fairyland Garden. Fourth, the resulting r-values were transformed (Fisher Z-transformation) to normally distributed z-values to allow comparison across trial types. Lastly, a mean split was performed on the z-values to categorise each trial as either a high-fidelity or a low-fidelity reinstatement trial for the analysis of the verbal response data.
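A NumPy sketch of the same computation (the original analysis used custom MATLAB code); array names and shapes are illustrative, and all patterns are assumed to be already masked to the top-2000-voxel ROI.

    # Correlate each word's Language pattern with its learning context's Imagery template.
    import numpy as np

    def rsa_fidelity(language_patterns, word_contexts, imagery_patterns, imagery_contexts):
        """language_patterns: words x voxels; imagery_patterns: trials x voxels.
        Returns one Fisher-Z-transformed Pearson correlation per word."""
        # participant-specific reinstatement template for each VR context
        templates = {c: imagery_patterns[imagery_contexts == c].mean(axis=0)
                     for c in np.unique(imagery_contexts)}
        fidelity = np.zeros(len(language_patterns))
        for i, (pattern, context) in enumerate(zip(language_patterns, word_contexts)):
            r = np.corrcoef(pattern, templates[context])[0, 1]   # Pearson correlation
            fidelity[i] = np.arctanh(r)                          # Fisher Z-transformation
        return fidelity

    # mean split into high- vs low-fidelity reinstatement trials, e.g.:
    # z = rsa_fidelity(lang, lang_ctx, imag, imag_ctx); high_fidelity = z > z.mean()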
Repeated Measures Analysis of Variance (RM-ANOVA). A 2 × 2 × 2 × 2 RM-ANOVA was performed on recall with the factors Times (T4, T5) × Reinstatement instruction (congruent vs incongruent) × RSA (high- vs low-RSA) × Presence (high- vs low-presence), using SPSS 26.0.46 The dependent variables were the proportion of phonemes recalled during T4 (short-delay recall in the MRI scanner) and T5 (one-week-delayed recall over the telephone).