Impact of the “Thinking while Moving in English” intervention on primary school children’s physical activity and academic outcomes: A cluster randomized controlled trial

Background: The majority of children internationally are not sufficiently active. Integrating physical activity into academic lesson time may not only help increase children’s activity levels but also improve learning. The aim of this study was to assess the impact of a classroom-based physical activity intervention on primary school students’ physical activity levels and academic outcomes (i.e., on-task behavior, literacy and executive function skills). Methods: This cluster randomized controlled trial included students (N = 283) from nine primary schools located in New South Wales, Australia. Schools were allocated to a control (n = 5 schools) or Thinking while Moving in English (TWM-E) (n = 4 schools) condition. Teachers received professional learning (i.e., a one-day workshop), resources (e.g., drill ladders, lettered bean bags), and mentoring from the research team. Teachers in the TWM-E condition delivered three 40-min physically active English lessons per week for six weeks, whereas the control condition continued with usual lessons. For both conditions, assessments occurred at baseline and post-test (six weeks). Children wore accelerometers on their wrists (Axivity) for one week during school time to measure physical activity intensity (primary outcome). On-task behavior was assessed using a momentary time sampling procedure and expressed as a percentage of lesson time. Standardized tests were used to assess literacy skills (i.e., spelling, grammar and punctuation) and executive functioning (i.e., inhibition and task shifting). Results: No significant group differences were observed for physical activity outcomes, spelling or executive functions.
Compared to the control group, the TWM-E group displayed improved on-task behavior (adjusted mean difference = 18.1% of lesson time, 95% CI [10.12 to 26.02], p < .001, d = 0.53), and improved grammar and punctuation scores (adjusted mean difference = 3.0, 95% CI [0.7 to 5.4], p = 0.018, d = 0.16). Conclusions: Despite minimal effect on overall physical activity levels, active lessons have important benefits for students’ on-task behavior and literacy.


Introduction
A wealth of studies supports the benefits of physical activity for children's psychological and mental health, and educational outcomes such as school engagement, cognition, metacognition, and academic achievement [1][2][3][4] . A recent systematic review suggested that the time, type, context and complexity of physical activity may trigger a variety of mechanisms that mediate its relationship with cognitive function 5 6 . Neurobiological (e.g., enhanced release of brain-derived neurotrophic factor and increased grey matter volume), psychosocial (e.g., physical self-perceptions, mood and emotions) and behavioral (e.g., sleep volume and quality, self-regulation skills) mechanisms may explain the positive effects of physical activity on young people's academic outcomes 5 .
Moreover, there are quantitative and qualitative aspects of physical activity that may moderate the effect of physical activity interventions. Several reviews and high-quality studies have revealed that participation in aerobic physical activity can improve executive functions in specific populations (e.g., low-active, overweight and obese children) 7 8 . Concomitantly, the qualitative characteristics of physical activity (e.g., type of exercise including task novelty, cognitive complexity, mental strategies activated, movement coordination) may act to increase brain stimulation 9-11 . For instance, preliminary evidence shows positive effects of cognitively engaging physical activity for boosting cognitive functioning 12 . Executive functions are especially fundamental for success in school, and in everyday aspects of work and life 11 . Core executive functions include self-regulatory cognitive operations, such as inhibition (maintaining focus), working memory (retaining and manipulating information), and cognitive flexibility (multi-tasking) 11 . Despite the well-established health benefits of physical activity, declining levels in children globally are alarming 13 14 .
Outside of physical education, sport, recess, and lunch breaks, additional opportunities for physical activity during the school day can include: i) non-curriculum focused classroom breaks, or short bouts of physical activity performed as a break from academic instruction time (also referred to as energiser breaks or activity bursts 15 ), and ii) curriculum-focused physical activity. The latter include curriculum-focused physical activity breaks that contain short bouts of physical activity that include curriculum content 16 (e.g., activity breaks combined with mathematics 17 ), and physically active academic lessons that involve the integration of physical activity throughout academic lessons in key learning areas 18 (e.g., active mathematics lessons 19 ). Curriculum-focused active breaks and active lessons differ in terms of relevance and purpose of the task. Active breaks typically include physical activity of short duration (e.g., up to 10 min) before or between academic content. Alternatively, in physically active lessons, physical activity is integrated into learning activities (typically lasting the duration of lessons; e.g., 45 min). High degrees of body engagement during the learning process may help enhance learning and information retention 20 21 . The advantage of the integrated approach is that information can be processed simultaneously in different sub-systems, thereby releasing working memory resources 22 .
Growing evidence affirms the positive effects of physically active lessons on children's educational outcomes 16 23 24 . Results from a recent meta-analysis suggest that participation in physical activity appears to have a positive effect on students' engagement in the classroom, often reflected in improvements in 'time on-task' and concentration (ES = 0.77) 3 . Student engagement, often referred to as 'on-task behavior' and 'time-on-task', is a prerequisite for academic success and for promoting learning by minimising classroom disruptions such as fidgeting or inattention (i.e., being "off-task" [25][26][27] ). For example, physically active lessons have shown a 72% improvement in children's on-task behavior. Also, after 2 years of implementation of three physically active lessons per week in second and third-grade classes from 12 elementary schools (e.g., jumps to solve multiplication questions), greater learning benefits were revealed in mathematics and spelling 28 .
Of note, the Programme for International Student Assessment has reported declining literacy scores across schooling years in several countries including Australia 29 . In general, about one quarter of Australian students achieve literacy scores at or below the minimum standards 30 31 . In response to the increased national focus on improving literacy skills, there has been a recent emphasis on improving instruction in primary school English lessons. Considering that one-fifth of the working-age population has low literacy levels 32 , enhancing these levels by one skill level may lead to increased likelihood of employment of 2.4-4.3% and an increase in wages of 10% 33 .
Integrating physical activity in the English content may have promising physical and academic benefits by improving the learning experience. Innovative strategies are needed in the design and delivery of literacy programs, potentially reinvigorating English lessons by enhancing children's on-task behavior, motivation, and learning outcomes. Furthermore, integrating physical activity in English lessons could address the lack of time available or devoted to physical activity during the school day, especially given that time is consistently reported as an implementation barrier by teachers 34 . The increased focus on physical activity has the potential to simultaneously bring about benefits both for children's physical wellbeing and their literacy scores.
We recently conducted a feasibility trial, delivered by research staff in a single school for four weeks, which showed that the Thinking While Moving in English (TWM-E) program shows promise for improving children's on-task behavior and spelling scores 35 . The current efficacy trial was informed by that feasibility trial, but here classroom teachers delivered the intervention after receiving professional learning from the research team. The primary objective of this cluster randomized controlled trial was to determine the effect of the TWM-E intervention on children's physical activity levels. Secondary objectives were to examine whether the TWM-E intervention improves on-task behavior, literacy skills (spelling, grammar and punctuation), and executive functioning (inhibition and task shifting).

Study design
The TWM-E cluster randomized controlled trial (RCT) was registered with the Australian and New Zealand Clinical Trials Registry (ACTRN12618001008213). Government schools were randomly selected within a 60-km radius from the University of Newcastle (e.g., Hunter, Central Coast, Newcastle regions). Written consent forms were received from school principals, teachers, and parents. Data collection occurred between April and September 2018. The design, implementation and reporting of the TWM-E study complied with the Consolidated Standards of Reporting Trials guidelines for clustered RCTs 36 . Detailed study methods are reported elsewhere 37 .

Randomisation
Schools were the unit of randomization. After receiving written consent, participating schools were matched by size and demographic characteristics based on the schools' Index of Community Socioeducational Advantage 38 , using a measure of relative advantage/disadvantage based on geographic area in Australia. Schools were randomized into experimental and waitlist control conditions after the baseline assessments using a computer-based algorithm by an independent researcher.

Participants
A total of 283 Grade 3 and 4 primary school students (Mage = 9.81, SD = 0.68) and their teachers (N = 12), who were willing to deliver physically active lessons, were recruited from 9 primary schools (each school contributed one class, apart from one control and one intervention school, which had two classes). Ethics approval was obtained from the University of Newcastle, New South Wales (NSW),

Power Calculation
A power analysis using procedures appropriate for an RCT study design 39 40 was conducted to determine the sample size required to detect changes in the primary outcome of accelerometer-determined physical activity. Calculations assumed baseline to post-test correlation scores of r = 0.30 and were based on 80% power and alpha level 0.05. Based on the reported physical activity effects (i.e., SD change = 200 counts per minute) after six weeks of the "Thinking While Moving in Maths" (aka EASY Minds) pilot study and an intra-class correlation coefficient (ICC = 0.15), a study sample of N = 200 with 8 clusters (i.e., schools) of 25 students would provide adequate power to detect a between-group difference of 200 counts per minute across the school day 19 40 . We initially intended to include Actigraph accelerometers in the study, but due to lack of access to these, we used Axivity instead. Hence, the counts per minute power calculation was not relevant.
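To illustrate why clustering matters for the sample size above, the following minimal sketch applies the standard design effect for equal-sized clusters, 1 + (m - 1) × ICC, to the values reported in the text (N = 200, 25 students per cluster, ICC = 0.15). It is an illustration only: the text does not state that the authors used this formula, and the function names are hypothetical.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_total: float, cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is 'worth'."""
    return n_total / design_effect(cluster_size, icc)


# Values reported in the text: N = 200, clusters of 25 students, ICC = 0.15.
deff = design_effect(cluster_size=25, icc=0.15)
n_eff = effective_sample_size(n_total=200, cluster_size=25, icc=0.15)
print(f"design effect = {deff:.2f}, effective sample size = {n_eff:.1f}")
```

Under these assumptions the design effect is about 4.6, so 200 clustered students carry roughly the information of 43 independent observations.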

Intervention
The TWM-E program supported classroom teachers to adapt their English lessons to include movement-based learning components and to deliver these lessons over a 6-week period (3 × 40 min lessons per week). The recommended lesson content was generated from the NSW K-6 English syllabus 41 . Participating teachers received a 1-day professional learning workshop, as well as equipment and resources for the activity components in the lessons (e.g., chalk, buckets, balls, whiteboards, drill ladders, skipping ropes, lettered bean bags, and lettered flexi-domes; value $400 AU), and mentoring from the research team (including 3 face-to-face school visits and observations).
The professional learning workshop provided the rationale for physical activity integration, presented the results of the feasibility trial, and offered practical examples of physical activity integration (i.e., online videos), access to English curriculum expertise and peer-supported planning sessions 37 . During the intervention, classroom teachers were responsible for the planning and delivery of all movement-based lessons. They were supported through weekly emails that answered questions and suggested solutions to issues as they arose. The research team also provided feedback and advice stemming from face-to-face observations of the active English lessons (i.e., 40 minutes). English lessons in both intervention and control groups occurred during the usual timetable slot (e.g., 9:00-11:00 am). The control group followed their usual practice (i.e., normal curricular lessons) for the duration of the study period. Schools in the wait-list control condition received the professional learning workshop at the end of the post-intervention assessments in September 2018.

Measures
Baseline assessments took place in April-June 2018 and the post-intervention assessments in September 2018. All study measures were conducted in the schools by trained research assistants who were blinded to the group allocations at baseline. The same research assistants were used for both time points (baseline and post-intervention). However, it was not possible to blind assessors to treatment allocation at follow-up as the physically active lessons occurred during regular lesson time when data collection took place. Consenting students completed the assessments under exam-like conditions following a verbal explanation from a research assistant. Demographic information (i.e., age, sex, language spoken at home) was collected via a student questionnaire at baseline.
Primary outcome: Physical activity during the school day was measured using tri-axial wrist-worn AX3 accelerometers (Axivity, York, UK). Wrist-worn Axivity monitors have been found to have high equivalence and agreement regarding acceleration, sedentary, light and moderate-to-vigorous intensity of physical activity in adults compared to GENEActiv and Actigraph GT9X 44 . Accelerometers were worn for five consecutive school days (i.e., Monday to Friday) from 9:00 am to 3:00 pm. Data were downloaded in raw format using the OmiGui Software and processed in R software (http://cran.rproject.org/) using the software package GGIR 45 . Non-wear time was classified within a 60 min time window if, for at least two out of the three axes, the standard deviation was less than 13 mg and the value range was less than 50 mg 46 . Data were reduced by calculating the average gravity-based acceleration units (g) per 1-s epoch, with daily time spent in moderate-to-vigorous physical activity (MVPA) determined using the sum of epochs averaging above 201 mg 47 . The average minutes spent in MVPA per day and average daily wear time were computed using data from each participant's valid days. Valid days were defined as more than five school hours on any given day 48 , for at least 3 days 49 .
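The data reduction rules above (1-s epochs, a 201 mg MVPA threshold, more than five hours of wear per valid day, at least three valid days) can be sketched as follows. This is a simplified illustration, not the GGIR pipeline used in the study: non-wear detection is assumed to have been done already, with non-wear epochs marked as None, and all function and constant names are hypothetical.

```python
MVPA_THRESHOLD_MG = 201.0  # from the text: epochs averaging above 201 mg count as MVPA
MIN_WEAR_HOURS = 5.0       # from the text: a valid day has more than five school hours
MIN_VALID_DAYS = 3         # from the text: at least 3 valid days are required


def mvpa_minutes(epoch_mg):
    """Minutes of MVPA in one day of 1-s epochs (mean acceleration in mg; None = non-wear)."""
    seconds = sum(1 for a in epoch_mg if a is not None and a > MVPA_THRESHOLD_MG)
    return seconds / 60.0


def wear_hours(epoch_mg):
    """Hours of wear time in one day of 1-s epochs."""
    return sum(1 for a in epoch_mg if a is not None) / 3600.0


def average_daily_mvpa(days):
    """Mean MVPA minutes per valid day, or None if there are too few valid days."""
    valid = [d for d in days if wear_hours(d) > MIN_WEAR_HOURS]
    if len(valid) < MIN_VALID_DAYS:
        return None
    return sum(mvpa_minutes(d) for d in valid) / len(valid)
```

For example, a participant with three six-hour school days, each containing 600 one-second epochs above 201 mg, would be assigned an average of 10 minutes of MVPA per day.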
Secondary outcomes: On-task behavior during English lessons was estimated as a percentage of time using a momentary time sampling procedure adapted by Riley and colleagues 50 from the "Behaviour Observation of Students in Schools" and the "Applied Behaviour Analysis for Teachers" 51 52 . On-task behavior is categorized as "active engagement" (defined as the time a child is actively engaged in an academic activity such as reading, writing, or performing the designated task) or "passive engagement" (such as sitting quietly absorbing the information but not actively engaged in the activity). Off-task behavior is defined as behavior that is not associated with the task, and is classified as off-task motor (such as walking around the class), off-task verbal (such as chatting), or off-task passive (such as looking around the class) 19 53 .
Using a random number-producing algorithm, 12 students per class (6 males, 6 females) were randomly selected based on the alphabetical class roll. Observations occurred at both time points (baseline and post-test) by two trained research assistants. Observations were conducted at the end of 15-sec intervals for 30 min in the allotted English time slot (i.e., 9:00-11:00 am), with each student's behavior coded as on-task (actively engaged or passively engaged) or off-task (off-task verbal, off-task motor or off-task passive) at the time. At the end of the following 15-sec interval, the next student's behavior was coded. Observers listened to an audio file via headphones, which informed them when to observe and record by circling an appropriate code (i.e., actively engaged, passively engaged or off-task) using an observation sheet. This process was repeated until each of the six students was observed 20 times.
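The arithmetic of this procedure (15-sec intervals over 30 min give 120 observations, which rotate across six students to give 20 observations each) can be checked with a short sketch. The code labels and function names below are hypothetical and are not taken from the study's observation sheet.

```python
ON_TASK_CODES = {"active", "passive"}  # hypothetical labels for the two on-task categories


def observation_schedule(n_students=6, interval_s=15, session_min=30):
    """Return (seconds_from_start, student_index) for every observation point."""
    n_intervals = session_min * 60 // interval_s
    return [((i + 1) * interval_s, i % n_students) for i in range(n_intervals)]


def percent_on_task(codes):
    """Percentage of observations coded as on-task (actively or passively engaged)."""
    return 100.0 * sum(1 for c in codes if c in ON_TASK_CODES) / len(codes)


schedule = observation_schedule()
per_student = sum(1 for _, student in schedule if student == 0)
print(len(schedule), per_student)  # 120 observations in total, 20 per student
```

Expressing on-task behavior as a percentage of observations is what allows results to be reported as a percentage of lesson time.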
During the actual study period, students were aware of the presence of the research team in the class, but did not know the purpose of their visit. Observers stood at the back of the classroom in order to minimise their influence on student attentiveness. We did not establish inter-rater reliability for this study. Instead, we sought to assess the maximum number of students in each class. However, our research team has previously established an intraclass correlation coefficient of 0.84 for the same on-task behavior assessments (Mavilidi et al., under review).
Literacy attainment was measured using the standardized "Progressive Achievement Test", following the Australian Council for Education Research recommendations 54 . Children were assessed on written spelling (30 items) and grammar and punctuation (35 items). The test was administered by the regular classroom teacher and children were given a maximum time of 30 minutes for each assessment.
Executive functioning was measured using validated tests from the National Institutes of Health Toolbox for 7-17 years 55 56 delivered on tablet devices. The flanker task examines inhibitory control ability. Participants are asked to respond whether the central arrow of a multi-arrow display is pointing left or right, using the index fingers of the left/right hand. The flanking arrows are either congruent (i.e., pointing in the same direction as the central arrow, →→→→→), or incongruent (i.e., pointing in the opposite direction to the central arrow, →→←→→). Children completed four practice and twenty test trials, with the test lasting approximately 3-5 minutes. More accurate and faster responses are produced for congruent than incongruent trials 57 58 .
The dimensional change card sort test examines set-shifting ability (i.e., the ability to switch between different sorting rules). Participants have to sort pictures according to one of two dimensions (e.g., shape and colour), and use explicit cues (the words 'shape' or 'colour') to shift between sorting rules on successive trials. Children completed three practice trials and test duration was approximately 4-6 min. Both tests were scored based on children's accuracy and reaction time. When accuracy levels were higher than 80%, the accuracy and reaction time scores were combined. When accuracy was 80% or lower, the final score was equal to the accuracy score 59 . Higher scores indicate better performance.

Process Evaluation
The feasibility, adherence and satisfaction of the TWM-E program were assessed through:
i. Post professional learning workshop questionnaire: Teachers responded on a 5-point Likert scale, ranging from 1 (strongly disagree) to 5 (strongly agree), regarding their perception of the skills acquired from the training, the satisfaction and quality of the training, and their confidence to deliver movement-based English lessons.
ii. Fidelity (session quality): in Weeks 2, 4, and 6, active English lessons were observed by the research team and assessed on developing English concepts (3 items; e.g., "Movements aided and promoted learning"), physical activity levels (3 items; e.g., "Equipment used promoted physical activity"), and students' engagement (3 items; e.g., "Students were engaged by the activities taught") using a 5-point Likert scale ranging from 1 (Not at all true) to 5 (Very true).
iii. Post-program student satisfaction: students responded regarding their perceptions of physically active English lessons using a 9-item questionnaire, with a 5-point Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree).

Statistical Analyses
Statistical analyses were conducted using IBM SPSS (version 24) and alpha level was set at p < 0.05.
The outcomes were analyzed using linear mixed models, which are (i) consistent with the intention-to-treat principle, (ii) robust to the biases of missing data and (iii) provide an appropriate balance of Type 1 and Type 2 errors 60 61 . Considering the hierarchical structure of the data (e.g., students nested within classes and schools), multilevel modelling analyses were used to analyse all outcomes 62 63 .
More specifically, the models were adjusted for the clustering at class level. In the current study, school-level clustering was negligible after accounting for clustering at the class level, also supported by previous research 64 . The results focus on the group-by-time effects, i.e., the interaction between Group (TWM-E, control) and Time (post-test, baseline).
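As a rough illustration of what a group-by-time effect estimates, the sketch below computes the difference between the mean change in the intervention group and the mean change in the control group. It is deliberately simplified and is not the analysis used in the study: it ignores class-level clustering, baseline adjustment and missing data, which the linear mixed models handle, and the function names and scores are hypothetical.

```python
from statistics import mean


def mean_change(baseline, post):
    """Change in the group mean from baseline to post-test."""
    return mean(post) - mean(baseline)


def group_by_time_effect(int_base, int_post, con_base, con_post):
    """Difference in mean change: (intervention change) - (control change)."""
    return mean_change(int_base, int_post) - mean_change(con_base, con_post)


# Hypothetical on-task percentages for two students per group (not study data).
effect = group_by_time_effect(
    int_base=[60, 70], int_post=[80, 90],
    con_base=[60, 70], con_post=[62, 72],
)
print(effect)  # 18: the intervention group improved 18 points more than control
```

A positive value favors the intervention group; a mixed model estimates the same contrast while accounting for students being nested within classes.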

Overview
A summary of the demographic characteristics is presented in Table 1. The majority of the participants were from an Australian cultural background (94.6%) and spoke English at home (97.3%).

Primary outcome: Physical activity
There was no group-by-time effect for moderate, moderate-to-vigorous, and vigorous intensity physical activity levels (see Table 2).
Note: M = Mean; CI = Confidence interval; CON = Control group; * Intention-to-treat linear mixed models were used; Adj. diff. in change = mean difference and 95% CI between TWM-E and CON after 6 weeks (Intervention - Control), adjusted for baseline scores; ICC = intraclass correlation coefficient. The highest possible scores were: 100 for on-task behaviour and executive function, 30 for spelling and 35 for grammar and punctuation.

Secondary Outcomes
A summary of intervention effects on academic outcomes is presented in Table 3.
Significant group-by-time effects were observed for both on-task and off-task behavior in favor of the TWM-E group. Specifically, the TWM-E group increased actively engaged behaviors, and reduced both passively engaged behaviors and off-task behaviors. Significant group-by-time effects in favor of the TWM-E group were also observed for grammar and punctuation, but not for spelling. There were no significant group-by-time effects for either executive function task, i.e., inhibitory control and set-shifting ability.

Discussion
The primary aim of this study was to evaluate the effectiveness of physically active English lessons on primary school students' physical activity levels. We did not find any significant intervention effects for physical activity outcomes. However, significant intervention effects were seen for on-task behavior and literacy attainment (i.e., grammar and punctuation). Finally, we did not see any intervention effects on spelling or executive function. These findings indicate that after participating in the TWM-E lessons students: (i) were more engaged during lessons, and (ii) experienced greater improvements in literacy compared to the control condition.
Contrary to our hypothesis, we did not find a significant increase in children's physical activity intensity levels. The majority of research on physically active lessons has examined moderate to vigorous intensity of physical activity 65 66 . For instance, a previous study on active learning in children of the same age group as this study (i.e., attending fourth grade) did not find changes in minutes of moderate-to-vigorous physical activity measured via accelerometry 65 , but did find significant changes in number of steps. Other studies on physically active lessons have reported significant improvements in primary school children's physical activity levels as measured by steps 35 53 , accelerometers 66 19 , and heart-rate monitors 67 . It is worth noting that, when assessing the effect of physically active lessons on students' physical activity, we need to determine whether the intervention has been implemented as per protocol and that there is no displacement in physical activity when children are more active during class time.
In the present study, children who participated in the active English lessons spent more time on-task than the sedentary control group. More specifically, there was a substantial increase in children's active engagement and a corresponding decrease in off-task behavior. This is important, as on-task behavior is a key predictor of academic success 68 . Previous intervention studies using physically active lessons have also demonstrated significant improvements in primary children's on-task behavior 19 65 69 . This is consistent with a recent meta-analysis showing that physical activity interventions produce improvements in classroom behavior (ES = 0.77) 3 . In addition, Owen et al. showed that integrating physical activity into classroom lessons had no effect on school engagement.
Contrary to previous research 35 67 , we did not find any difference in children's spelling scores. This could be due to the relatively short duration of the intervention. It is also possible that the physical activities selected by the teachers focused more on grammar and punctuation skills, or that teachers did not provide sufficient feedback on correct spelling to children after performing the physically active spelling activities.
Alternatively, standardized tests for spelling may not be able to capture changes in students' progress over a short time-frame. Future research should focus on targeting activities that explicitly improve spelling skills if that is an area of concern for classroom teachers. Nevertheless, considering the short study duration, this intervention offers promising results on learning outcomes, in particular grammar and punctuation.
Lastly, no intervention effects were observed regarding children's executive functions. Currently, there is a scientific debate regarding the effects of physical activity on executive functions. In a recent review, Diamond and Ling 11 stated that exercise can enhance executive function ability only when it specifically involves practice of executive functioning skills. Thus, physical activities such as martial arts or yoga that explicitly train diverse executive functions can produce more prominent cognitive benefits than treadmill running, or stationary biking. In contrast, Hillman and colleagues have argued that aerobic exercise that improves cardiorespiratory fitness can enhance activation in brain areas that support executive functioning and high-order thinking (e.g., motor cortex, cerebellum, basal ganglia) 7 as well as executive function ability 72 .
Current evidence suggests the benefits of high intensity physical activity on fitness, and mental and cognitive health 73 74 . Hence, it is likely that the volume and intensity of physical activity implemented in the present intervention were insufficient to improve children's physical fitness and, in turn, elicit substantial neurobiological and cognitive changes. Animal studies suggest that improvements in cardiorespiratory fitness are necessary to induce neurogenesis 75 . In particular, strenuous aerobic exercise (e.g., heart rate reaching 70-85% 76 or around 120 bpm in adolescents 77 and 160 bpm in children aged 9-10 years 9 78 ) appears to be required to improve executive functions.
Chronic interventions targeting cognitively engaging physical activity have shown improvements in children's executive function skills 79 80 . For instance, a study in children aged 7-9 years involving cognitively engaging physical activity breaks revealed delayed effects in cognitive (i.e., shifting) and learning (i.e., mathematics) performance after 20 weeks 81 . Moreover, a recent study integrated physical activity with language learning in primary school children. The intervention was delivered for 10 min per day, twice per week for two weeks. Although children's learning improved after the end of the intervention, no acute effects were found on attentional performance which, in fact, deteriorated 82 . In addition, children's self-reported level of cognitive exertion did not vary between the condition that integrated physical activity with learning and the control condition.

Educational implications
Overall, the TWM-E intervention delivered over a 6-week interval improved on-task behavior and academic performance. The program components (delivery and content) were well received by teachers and students, indicating the program's feasibility and potential sustainability for introducing movement-based English lessons in primary schools, and the potential for such lessons to become part of regular practice. Importantly, the post-program evaluation questionnaire revealed that children rated the program as very enjoyable, consistent with evidence from other studies that students consider integrating physical activity an enjoyable teaching method across several learning domains [83][84][85][86][87][88] .
High levels of perceived competence and need satisfaction can result in increased self-efficacy (i.e., confidence in someone's ability to perform a task), and inherent motivation, and in turn, improved academic performance 89 . In fact, positive mood can improve academic outcomes with regard to engagement and achievement 90 91 , with motivated students showing higher engagement in lessons and obtaining higher grades 92 .
In addition, although Axivity monitors have been shown to be reliable for measuring physical activity in over 100,000 participants 94 , most research with primary school-aged children uses waist-worn Actigraph accelerometers 19 95 , while the one study that used Axivity AX3 accelerometers had them mounted with tape on the waist 96 . It is suggested that when using Axivity accelerometers in children, a dual-accelerometer system with sensors placed on the thigh and the back should be included for greater accuracy 97 . However, this may not be a viable option for school-based research in young children. Wrist-worn Axivity monitors have been found to have high equivalence and agreement regarding acceleration, sedentary, light and moderate-to-vigorous intensity of physical activity in adults compared to wrist-worn GENEActiv and Actigraph GT9X 44 . Nonetheless, it is important to note the ongoing debate regarding wrist-worn and waist-worn accelerometry, with significant and substantial differences reported in counts per minute across all intensities 98 .

Conclusions
The benefits of physical activity for children's physical, social, and psychological development are well-established. The TWM-E intervention presents a feasible and practical approach to increase engagement in primary school children, with potential physical activity benefits for combatting the declining levels of physical activity in children. Importantly, this study shows that physically active lessons can enhance children's literacy attainment, a key area requiring urgent policy and practice amendments by stakeholders.
