Code Availability
The de-identified data, analytic code, and materials for this study are all available on our Open Science Framework (OSF) page (link in Data Availability section). The analyses were conducted in R (version 4.2.1), using the lme4 library (version 1.1-30) and the emmeans library (version 1.8.0). The study design, hypotheses, and analytic plan were all preregistered on OSF. Sample size determination, data exclusions, manipulations, and measures are detailed below.
Participants
To our knowledge, no relevant prior studies exist that evaluate the effects of a knowledge intervention on older adults’ segmentation ability. While Blasing (2015) reported significant intervention effects in college students (Experiment 2), no effect sizes or measures of variance were reported41. Therefore, we were unable to conduct an accurate power analysis. Given that older adults exhibit more variability in segmentation and memory performance, we decided to recruit a much larger sample than that reported in Blasing (2015; n = 8 per group) to detect effects of knowledge on segmentation within older adults. Study procedures were approved by Kansas State’s Institutional Review Board. All methods were performed in accordance with the relevant guidelines and regulations. Informed consent was obtained from all subjects prior to their participation.
Participants aged 65-85 years were recruited from an online database maintained by the Memory & Aging lab at Kansas State University, as well as through various local and national volunteer organizations. A total of 80 older adults (56 women, 24 men) participated, with an overall average age of 71 years (SD = 5.4). Recruitment and data collection for this study took place in 2021. Given that this was an online study, participants were recruited from across the country (e.g., Kansas, Missouri, Texas, Ohio, Illinois, New York).
Table 1 contains the demographics of each group. Independent-samples t-tests assuming equal variances compared the two groups on each of the continuous variables (age, MoCA score, and years of education). Semantic fluency was calculated as the number of unique items produced in two measures (category and letter fluency) given during the pre- and post-intervention tests (see section “Semantic Fluency” below). No significant differences between groups were found.
Table 1
Demographic information by workshop group

| | Gyoza Group (n = 38) | Tai chi Group (n = 42) | Independent Samples t-test |
| --- | --- | --- | --- |
| Gender | 24 Females / 14 Males (63.2% Females) | 32 Females / 10 Males (76.2% Females) | |
| Racial distribution | 37 white, 1 more than one race | 39 white, 1 Asian, 1 more than one race, 1 prefer not to answer | |
| Age | 71.00 (SD = 4.74) | 71.00 (SD = 6.06) | t(78) = 0.00, p = 1.00 |
| MoCA | 13.23 (SD = 1.30) | 13.61 (SD = 1.20) | t(78) = -1.36, p = 0.17 |
| Education | 16.55 (SD = 2.85) | 16.80 (SD = 2.93) | t(78) = -0.39, p = 0.69 |
| Semantic Fluency | 42.20 (SD = 9.20) | 42.02 (SD = 10.02) | t(153) = 0.11, p = 0.91 |

Note. Education = average number of years of education; MoCA = average score on the mini-MoCA test; Semantic fluency = average number of items produced on category and letter fluency measures (measured at pre-test and post-test).
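As a check on Table 1, the equal-variance (pooled) t statistics can be reproduced directly from the reported group means, SDs, and sample sizes. The following standard-library sketch recovers the MoCA comparison:

```python
import math

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Independent-samples t with pooled (equal) variances, df = n1 + n2 - 2."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df  # pooled variance
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))               # SE of the mean difference
    return (m1 - m2) / se, df

# MoCA row of Table 1: Gyoza group vs. Tai chi group
t, df = pooled_t(13.23, 1.30, 38, 13.61, 1.20, 42)
print(round(t, 2), df)  # -1.36 78
```

The result matches the t(78) = -1.36 reported in the table.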
Figure 7 presents a schematic of the recruitment numbers and the experimental design. To be eligible for the study, potential participants had to complete an expertise survey that tested knowledge of two activities: Tai chi, a low-impact Chinese martial art, and gyoza-making, the preparation of Japanese dumplings (see Supplemental Materials).
The survey was designed by the expert responsible for instructing each workshop. It was composed of nineteen (Tai chi) and twenty (gyoza-making) multiple-choice questions based on the materials to be covered during the training workshops. Participants received no feedback on correct answers or their scores. All topics in the expertise survey were taught during the respective workshops. Participants retook the expertise survey during the last portion of the study to evaluate learning from the workshops.
Two hundred and twenty-five potential participants completed the survey, but sixty-four of them scored above our cutoff (> 10 correct answers for each activity). They were deemed too knowledgeable and were excluded. Participants who scored less than 60% correct on a test were deemed novices.
If a potential participant was deemed a novice in both activities, a research assistant scheduled a phone call to determine whether the participant was cognitively healthy and physically able to participate in the activities. The phone screening lasted approximately thirty minutes and covered questions about demographics, medical history, and contact information, along with two brief dementia screening measures: the AD8 (score > 2 = ineligible)60 and the Short Blessed (score 0-4 = healthy cognition; > 10 = ineligible)61. In addition to requiring that participants be novices in the targeted activities, the eligibility criteria excluded people with significant visual problems, enduring neurological problems, unmedicated psychiatric disorders, excessive alcohol use, or use of benzodiazepines and other memory-interfering drugs. Additionally, only fluent English speakers were included, to ensure comprehension of the verbal memory tests and communication during the workshops. Finally, to evaluate overall cognitive functioning, participants completed the brief version of the Montreal Cognitive Assessment (Mini-MoCA)62 via the Zoom platform; this version of the MoCA was selected for its reliability in remote cognitive screening. Prior to using Zoom, participants were offered Zoom training to ensure that each felt comfortable with the platform. Participants were scored by a certified MoCA rater using the official instructions for the Mini-MoCA version 2.1. Participants were not informed of their score during the screening process and were allowed to participate in the workshops regardless of their scores. However, only participants scoring eleven or above (out of fifteen), which is considered normal cognition per the official Mini-MoCA version 2.1 instructions, were included in the data analyses. One person assigned to the Gyoza group was excluded from the data analyses due to an insufficient MoCA score but completed the study.
Participants were compensated $50 for their participation.
Materials
Videos
Participants were first asked to watch a practice video of a person building a boat from Duplo blocks (duration = 155 seconds) to become familiar with the experimental procedure. Then they watched two additional videos that were filmed from a fixed perspective without zoom and without audio. Each video contained one actor performing one activity. One video was of an actor performing Tai chi movements (duration = 307 seconds) and the other video was of an actor making gyozas (duration = 411 seconds; see Supplemental Materials Fig. S7 for stills from both videos).
Segmentation Task
We used the unitization task originally reported by Newtson (1971)63 to assess participants’ event segmentation ability. While watching the videos, participants were asked to press their keyboard’s spacebar each time they believed one natural and meaningful unit of activity in a video ended and another began (as in Zacks et al., 200154). Participants practiced the segmentation task during the practice video described above. To shape participants’ segmentation behavior by helping them focus on events of a similar grain size across the sample, participants were required to segment the practice video at least three times. Participants were not aware of this threshold. Those who segmented fewer than three times were simply asked to identify a few more units of activity, and the practice video was repeated until they segmented three or more times. Once they met this requirement, the program moved on to the segmentation task for the two experimental videos.
The number of meaningful boundaries identified by a participant (i.e., segmentation count), operationalized as the number of times they pressed the keyboard’s spacebar during a video, was recorded. This value was used as a predictor variable to evaluate whether the number of perceived events influenced participants’ segmentation agreement and memory performance.
Segmentation agreement is the degree to which one participant’s identified event boundaries align with those identified by others. In our analyses, we used each activity’s knowledgeable group at post-test (those who went through the related workshop) as the reference group against which all other segmentation similarity values were compared, because the knowledgeable group for each activity should have the most informed normative event boundaries. In other words, for the Tai chi video, each participant’s event boundaries at pre- and post-test were compared with the boundaries identified at post-test by those who completed the Tai chi workshop. For the gyoza video, each individual’s event boundaries at pre- and post-test were compared with the boundaries identified by the Gyoza group at post-test. This ensured that all segmentation agreement values were calculated relative to the same reference point.
Segmentation agreement was calculated by fitting a one-second Gaussian kernel around each participant’s button press (i.e., each perceived event boundary). This gives each frame of the video a smoothed likelihood (on a scale of zero to one) that the participant perceived an event boundary on that frame. Normative event boundaries were identified by averaging the event boundary probabilities from each “knowledgeable” participant on the post-test segmentation task. The correlation between each participant’s event boundary probabilities and the knowledgeable group’s normative boundaries was then calculated. Importantly, this was calculated for each video using a leave-one-out procedure, i.e., a participant’s own post-test segmentation responses were not included in the normative boundaries used for their comparison. For more on this segmentation agreement calculation, see Pitts et al., 202237 and Newberry et al., 202144.
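In outline, this computation can be sketched as follows. The sketch below assumes one frame per second and rescales each smoothed track to a 0-1 likelihood by its peak; the published pipelines (Pitts et al., 2022; Newberry et al., 2021) may differ in kernel normalization and frame rate.

```python
import math

def boundary_track(press_times, n_frames, sd=1.0):
    """Gaussian-smoothed likelihood of an event boundary at each frame."""
    track = [sum(math.exp(-((f - t) ** 2) / (2 * sd ** 2)) for t in press_times)
             for f in range(n_frames)]
    peak = max(track)
    return [v / peak for v in track] if peak > 0 else track  # rescale to 0-1

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

def agreement(own_presses, knowledgeable_presses, n_frames):
    """Correlate one participant's track with the knowledgeable group's average
    track; leave-one-out applies when the participant is in the norm group."""
    others = [p for p in knowledgeable_presses if p is not own_presses]
    tracks = [boundary_track(p, n_frames) for p in others]
    norm = [sum(vals) / len(tracks) for vals in zip(*tracks)]
    return pearson(boundary_track(own_presses, n_frames), norm)
```

A participant whose presses fall close to the group's normative boundaries yields an agreement value approaching 1.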
Free Recall Task
Participants were instructed to recall the video they had just watched and to type out a description of that video. No other prompts were provided. This typed free recall procedure has been used with older adults in previous studies35,37,52. Free recall performance was scored using the Action Coding System (ACS)35,37,52,64. The ACS provides a way of coding free recall of complex activities by grouping actions into larger goals called A2 units (e.g., assembles gyozas) and smaller subgoals called A1 units (e.g., picks up one wrapper, puts spoonful of food onto the wrapper, moistens edge of wrapper, folds wrapper around food, pleats wrapper edges). The gyoza video consisted of 16 A2 units and 86 A1 units, whereas the Tai chi video consisted of 19 A2 units and 159 A1 units. To score the data, two trained, independent raters each scored the recall data from the same three participants, and their reliability was calculated (average inter-rater Kappa = 0.77 for A1 units and 0.88 for A2 units). Discrepancies were discussed and resolved. Each rater was then assigned to score all of the data for a particular video. Performance was scored as the proportion of action units correctly recalled (i.e., total action units correctly recalled divided by total action units in the video) for both A1 and A2 units. These proportions were strongly correlated (r = 0.84; see Table S7 in Supplemental Materials for a correlation table of all memory measures), so we averaged them to create a composite recall variable. Thus, each participant received an average proportion correct for recall performance.
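The inter-rater reliability statistic can be computed over the two raters' unit-by-unit codes. A minimal standard-library sketch, assuming the reported values are Cohen's kappa over binary codes (1 = recalled, 0 = not recalled):

```python
def cohens_kappa(rater1, rater2):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(rater1)
    labels = set(rater1) | set(rater2)
    # observed proportion of agreement
    p_obs = sum(a == b for a, b in zip(rater1, rater2)) / n
    # expected agreement by chance, from each rater's marginal label rates
    p_exp = sum((rater1.count(l) / n) * (rater2.count(l) / n) for l in labels)
    return (p_obs - p_exp) / (1 - p_exp)
```

For example, `cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])` returns 0.5: the raters agree on three of four units, half of which would be expected by chance.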
Recognition Task
The recognition task consisted of twenty two-alternative forced choice (2AFC) trials (see Supplemental Materials Fig. S8 for visualization). For each trial, the participants saw two images: One of these images came from the video the participant just watched, and the other image came from a different video filmed in the same location with the same actor. The participants were instructed to click on the image that came from the video they had just watched. Recognition performance was scored as the proportion of correctly identified images. This 2AFC discrimination between real and foil video stills has been used previously as a measure of event memory36,40, including usage with older adult participants35,37,52.
Order Memory Task
For this task, participants were given sixty-six 2AFC trials (see Supplemental Materials Fig. S8 for visualization). However, this time both of the images came from the video the participant had watched and the participant was instructed to identify which image came first in the video. Order memory performance was scored as the proportion of correctly identified images. This 2AFC task is a modified version of the order memory task used by Zacks et al. (2006)36. This modified version has been used previously with older adults by Smith et al. (2020)52 and Pitts et al. (2022)37.
Semantic Fluency Tasks
Participants also completed two semantic fluency measures in the pre- and post-intervention tests. In the category fluency measure, they were asked to type as many objects of a certain type as they could in sixty seconds (“animals” in pre-test and “vegetables” in post-test). In the letter fluency measure, they typed as many words starting with a certain letter as they could in sixty seconds (“S” in pre-test and “M” in post-test). The order of the two fluency tests was counterbalanced across participants. Performance was calculated as the total number of unique items produced. This task served as a distractor separating the video-watching and memory-measure (free recall, recognition, and order) stages of the experiment. It also served to ensure that neither workshop group excelled over the other in semantic fluency or typing ability, both of which could impact the evaluation of the free recall responses.
Knowledge Training Workshops
To provide participants with knowledge of and experiential learning in an activity, they took part in a two-session workshop. Each session was one hour long, and the sessions took place two days apart (e.g., Tuesday and Thursday of the same week) to prevent fatigue. A script was developed for each of the gyoza and Tai chi workshops, based on the corresponding expertise survey, to guarantee inclusion of all relevant information. All scripted information fit within a single one-hour session; therefore, participants had the opportunity to learn the material twice across the two sessions.
The workshops took place online via the Zoom platform with a live instructor who provided feedback to participants as they completed their training. Participants completed the workshops in small groups of one to six people. Group size was based on participants’ availability and was capped at six per group so that the workshop instructor and a research assistant could closely monitor all participants. The research assistant was also responsible for monitoring participants’ cameras and connections to ensure that they were present and effectively connected to the Zoom meeting.
Gyoza workshops. The gyoza workshops started with an introduction to the activity, which covered gyoza origins, similar foods from other cultures, and traditional ingredients. It also included specific details such as wrapper thickness, amount of filling, and number of pleats in each gyoza. Prior to the workshops, participants either bought wrappers or made their own, and they also prepared a filling of their preference. The instructor demonstrated how to make wrappers from scratch and how to properly assemble each gyoza. Participants were then given the opportunity to practice assembling their own gyozas. After a batch was made, the instructor moved on to the cooking portion of the workshop, which occurred during both sessions. Although all participants were required to practice assembling gyozas, they were not required to cook them, since some did not have access to a full kitchen while participating in the workshops.
Tai chi workshops. The Tai chi workshop also started with an introduction to the activity, including its history, health benefits, and key components of this type of martial art. Participants were then taught a sequence of Tai chi moves that could be done either standing or sitting, depending on the participant’s preference. The sequence included seven different movements performed three times on each side of the body, moving from head to toe. In each workshop, the instructor demonstrated the complete sequence three times with participants following along with the moves.
Procedure
Due to the COVID-19 pandemic, the entire study was conducted online. Participants who met all eligibility criteria described above were sent the informed consent form to read and sign. On the consent form, participants indicated whether they were familiar with Zoom. Participants who reported being unfamiliar with the videoconferencing platform were offered training on how to install and use the software. Then, to guarantee that all participants could connect to the online workshops and use the platform’s basic functions, the mini-MoCA assessment was conducted via Zoom after an effective connection (functioning audio and video) was established. Despite these measures to ensure technical success, two participants chose to discontinue the study due to connection or other technology-related difficulties (see Fig. 7).
After completing the mini-MoCA, participants received a schedule of their study sessions via email, including links to Pavlovia.org, the platform used to host the pre- and post-tests, along with the Zoom links for the two workshops. Using the Pavlovia platform required no additional training. After clicking on the Pavlovia link, the experiment began and instructions for the segmentation and memory tasks were provided on-screen, just as they would be in an in-person study. During the pre-test, participants watched and segmented the practice video, followed by the two experimental video blocks. To proceed to the experimental video blocks, participants had to segment the practice video into at least three segments. In each video block, they watched and segmented a video (e.g., the Tai chi or gyoza video) and then completed one of the semantic fluency tasks as a task separating the video presentation from the memory measures. Finally, they completed the memory measures in this order: free recall, recognition, and order memory. After completing the memory measures, participants moved on to the second video block. The order of the videos and of the semantic fluency tasks was counterbalanced across participants and sessions. Participants completed their pre-tests four to six days prior to the workshops.
Participants were then randomly assigned to one arm of the study (gyoza-making or Tai chi). One participant dropped out of the study after group assignment because they preferred the other activity (see Fig. 7). The remaining participants took part in two workshop sessions, separated by one day, on Zoom. Between one to two days after the final workshop session, they completed their post-tests on Pavlovia.org. The procedure was the same; however, video order was counterbalanced (if they saw gyoza-making first during the pre-test, they saw Tai chi first during the post-test). At the end of the post-test, participants completed the same expertise survey questions that they were given prior to enrolling in the study to assess how much semantic learning occurred within the workshops. Finally, the participants were debriefed and compensated for their time.
Analyses
For all analyses, we conducted multilevel models incorporating fixed effects of Group (Tai chi group vs. Gyoza group), Time (pre- vs. post-test), and Activity (Tai chi vs. Gyoza), as well as random effects (detailed below). These analyses were conducted in R (version 4.2.1) using the lme4 library65; the emmeans library66 was used to calculate the least-squares means for each model. The random effect structure of each model was independently chosen as the one producing the best-fitting model (identified as the model with the lowest Akaike Information Criterion value67). Further, each multilevel model used a distribution appropriate to the type of data: a Poisson distribution for count data (e.g., segmentation count), a binomial distribution for binomial data (survey data, all memory measures), and a Gaussian distribution for continuous data (segmentation agreement). All binomial logistic regressions were run with proportions as the predicted variable, using a weight value to inform the models of the total number of trials.
Of the tested random effect structures, participant was included as a potential random intercept (reflecting the random variability of each participant sampled from the population), and Time was allowed to vary as a potential random slope (as each participant could show a unique change over time).
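The selection rule reduces to computing AIC = 2k - 2 * log-likelihood for each candidate random-effect structure and retaining the lowest. The sketch below illustrates the comparison using hypothetical log-likelihoods and parameter counts (the lme4-style formulas in the comments label the candidate structures):

```python
def aic(log_lik, n_params):
    """Akaike Information Criterion: 2k - 2 * log-likelihood (lower is better)."""
    return 2 * n_params - 2 * log_lik

# Hypothetical fits of two candidate random-effect structures for one outcome:
# "(1 | participant)"        -> random intercept only
# "(1 + time | participant)" -> random intercept plus random slope for Time
candidates = {
    "(1 | participant)": aic(-412.3, 5),
    "(1 + time | participant)": aic(-405.1, 7),
}
best = min(candidates, key=candidates.get)
print(best)  # the structure with the lowest AIC is retained
```

Here the extra slope parameters are justified because the improvement in log-likelihood outweighs the 2-per-parameter penalty.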