Interlimb Coordination of Rhythm and Beat Performance

Interlimb coordination is critical to the successful performance of simple activities in everyday life, and it depends on precisely timed perception-action coupling. This is particularly true in music-making, where performers often use body movements to keep the beat while playing more complex rhythmic patterns. In the current study, we used a musical rhythmic paradigm of simultaneous rhythm/beat performance to examine how interlimb coordination between voice, hands and feet is influenced by the inherent hierarchical relationship between rhythm and beat. Sixty right-handed participants (musicians, amateur-musicians and non-musicians) performed three short rhythmic patterns while keeping the underlying beat, using 12 different combinations of voice, hands and feet. Results revealed a bodily hierarchy with five levels: 1) left foot, 2) right foot, 3) left hand, 4) right hand, 5) voice, implying a more precise task execution when the rhythm was performed with a limb occupying a higher level in the hierarchy than the limb keeping the beat. The notion of a bodily hierarchy implies that the role assigned to the different limbs is key to successful interlimb coordination: the performance level of a specific limb combination differs considerably, depending on which limb holds the supporting role of the beat and which limb holds the conducting role of the rhythm. Although performance generally increased with expertise, the evidence of the hierarchy was consistent in all three expertise groups. The effects of expertise further highlight how perception influences action: Embracing a predictive coding view, we discuss the possibility that musicians’ more robust metrical prediction models make it easier for musicians than for non-musicians to attenuate prediction errors. Overall, the study suggests a comprehensive bodily hierarchy, showing how interlimb coordination is influenced by hierarchical principles in both perception and action.


Introduction
Interlimb coordination is critical to the successful performance of simple activities in everyday life, such as walking or performing bimanual tasks. Such activities depend on precisely timed perception-action coupling [1]. In many cases, interlimb coordination is also guided by a hierarchical conduct-support relationship between the limbs, mirroring the inherent conduct-support relationship of an integrated task.
For instance, the primary goal of peeling an apple is realized by the dominant hand while the nondominant hand holds the supporting role of holding the apple [2]. This horizontal dimension characteristic of bimanual interlimb coordination has been extensively studied [2][3][4]. In the present study, we examined how the conduct-support relationship of an integrated musical task influences human interlimb coordination, also in the vertical dimension. Through the use of a musical rhythm paradigm, we utilized the inherent hierarchical structure of rhythm and beat to assign conductive and supportive roles to separate limbs (in congruent and incongruent combinations). This enabled us to provide a comprehensive account of how interlimb coordination is governed by hierarchical principles present in both action and perception.
Music is particularly well-suited for studying timing-related aspects of perception/action coupling [5]. When listening to music, we automatically extract rhythmic regularities, most saliently the underlying beat, and we group and nest these regularities to form a hierarchical metrical structure that is used as a framework for interpreting the perceived sound events [6][7][8][9][10]. Body movements like foot-tapping or nodding play a facilitatory role in distinguishing the lower level regular beats from the more complex musical rhythm [11].
Expressed in the figure-ground terminology of the Gestalt tradition [12], body movements reinforce the perceptual ground on which the rhythmic figure is interpreted. The hierarchical aspects of rhythm perception have recently been coupled with modern theories of fundamental brain function such as the theory of predictive coding [13]. In this view, the metrical structure forms a predictive model that is constantly shaped and challenged by produced or auditorily perceived rhythms engaging higher levels of the brain hierarchy. As such, when playing or listening to music, synchronizing body movements to the beat underlines and reinforces the lower level of the predictive model, and thereby facilitates reduction of prediction errors.
Synchronizing body movement to musical beats depends on the unique human ability to extract regular beats from complex rhythms. It is a trait that appears to be innate in humans [14], or in any case develops spontaneously, and tapping along with a beat is relatively easy for most people, musically trained or not [15][16][17][18]. A considerable amount of the research on sensorimotor synchronization is represented by tapping studies in which participants were asked to tap along with either a rhythm or its underlying beat [11][19][20][21][22]. In comparison, little is known about simultaneous performance of beat and rhythm. One study [23] showed that simultaneous performance of rhythm and beat with combinations of hands and feet is governed by a so-called "order of rhythm dominance". This corresponds to a bodily hierarchy consisting of four levels: 1) left foot, 2) right foot, 3) left hand, 4) right hand, where right-handed participants prefer to perform the rhythm with a limb occupying a higher level in the hierarchy than the limb keeping the beat. The hierarchy implies different levels of difficulty in the coordination of two limbs, depending on whether their respective roles of 'rhythm performer' and 'beat keeper' are in accordance with or go against the bodily hierarchy. The study also found effects of musical expertise, where coordinating two limbs in simultaneous rhythm and beat production was more difficult for non-musicians than for musicians.
In music, the voice is perhaps the most versatile means to produce sound and rhythm. This would imply a dominating position in the bodily hierarchy. As a case in point, vocalizing the rhythm of the famous musical phrase "We're Sergeant Pepper's Lonely Hearts Club Band" (The Beatles, 1967) while clapping in time with the underlying beat is markedly easier than reversing the roles of the hands and the voice, i.e., vocalizing a regular beat (da-da-da-da) while clapping the more complex rhythm. Therefore, we propose a comprehensive bodily hierarchy comprising voice, hands and feet (figure 1). Several studies on verbal and manual coordination exist, but they primarily investigated production of sentences or very simple, repetitive rhythmic patterns, often with the hands tapping as fast as possible [24][25][26][27][28]. In our study, the inherent hierarchical properties of rhythm and beat create a coordination task where the two components of an integrated task (performing a rhythm and keeping a beat, respectively) can be performed on equal terms by voice, hands and feet. This enables us to study the differences between moving with or against the proposed hierarchy.
To assess whether this bodily hierarchy reflects the hierarchical structure of rhythms in music, i.e., with lower levels in the bodily hierarchy representing lower levels in the metrical structure and vice versa, we asked participants to use their bodies as a musical instrument. We adapted and refined the basic paradigm used by Ibbotson and Morton [23]. Musicians, amateur-musicians and non-musicians were included in order to assess putative effects of expertise, and the Musical Ear Test [29] was included as a second measure of musical competence. Using the terminology of Ibbotson and Morton, the term 'dominant' will henceforth be applied to combinations going with the hypothesized hierarchy, and 'non-dominant' will be applied to combinations going against it. Assessing how musical expertise affects participants' abilities to coordinate body parts allowed us to obtain nuanced insight into how perception influences action in the form of interlimb coordination and dual motor tasks.

Methods And Materials
Participants
Sixty-three participants were recruited for the study. Three were excluded, as they were unable to perform the rhythmic patterns in isolation. This left 60 participants: twenty musicians (mean age = 23.8, SD = 2.40, 7 female), twenty-two amateur-musicians (mean age = 24.36, SD = 1.68, 13 female) and eighteen non-musicians (mean age = 25.06, SD = 3.94, 12 female). The musicians were actively performing professionals or conservatory students, the amateur-musicians performed music at a hobby level and the non-musicians had no more than one year of formal music education besides the mandatory music classes in primary school. All participants were right-handed as assessed by the Edinburgh Inventory [30]. No attempt was made to balance the groups for gender, since former studies have shown that musical competence generalizes across gender [29]. The study was performed at Center for Music in the Brain, Dept. of Clinical Medicine at Aarhus University in accordance with the guidelines and regulations for behavioral studies in force at Aarhus University. No personal sensitive information was solicited from participants, and informed consent was obtained from all participants at the beginning of each session.

Paradigm
The experimental paradigm consisted of three subtests, each with a different rhythmic pattern assigned to it (figure 2). The rhythmic patterns were designed to be of different complexity (low, medium and high), based on the number of syncopations [7]. The participants' task was to produce the rhythmic pattern of the given subtest using one limb while maintaining a regular beat with another limb. While the hands and feet produced sound by tapping, the participants were instructed to vocalize the syllable 'da' when using their voice. Participants used combinations of five different limbs to perform the rhythm/beat tasks: voice (V), right hand (RH), left hand (LH), right foot (RF) and left foot (LF). Henceforth, when referring to a combination pair, a plus sign will be used, e.g., RH+LF.
The combinations including the voice were of primary interest, as all combinations of hands and feet had already been studied previously [23]. For the purpose of replication, we also tested RH+LH and RH+LF. Thus, 6 combination pairs were included: V+RH, V+LH, V+RF, V+LF, RH+LH, and RH+LF, each in its dominant and non-dominant version.

Procedure
A training session preceded each subtest, ensuring that participants were able to perform the rhythm with each limb in isolation, so that differences in performance would not be due to motor deficiencies. The experimenter tapped the beat, while the participant performed the rhythmic pattern consecutively with each limb, and the training session ended when the participant felt comfortable with the rhythmic pattern.
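As an illustration of the design described under Paradigm, the 12 rhythm/beat assignments can be enumerated from the six combination pairs and labelled by comparing levels in the hypothesized hierarchy. The following Python sketch is not part of the original study; the names `HIERARCHY`, `PAIRS`, `direction` and `enumerate_combinations` are hypothetical and chosen for illustration only.

```python
# Hypothesized bodily hierarchy from the text, ordered from lowest to highest level.
HIERARCHY = ["LF", "RF", "LH", "RH", "V"]

# The six combination pairs tested in the study.
PAIRS = [("V", "RH"), ("V", "LH"), ("V", "RF"), ("V", "LF"), ("RH", "LH"), ("RH", "LF")]


def direction(rhythm_limb, beat_limb):
    """Label an assignment 'dominant' if the rhythm limb ranks above the beat limb."""
    rhythm_level = HIERARCHY.index(rhythm_limb)
    beat_level = HIERARCHY.index(beat_limb)
    if rhythm_level == beat_level:
        raise ValueError("rhythm and beat must be assigned to different limbs")
    return "dominant" if rhythm_level > beat_level else "non-dominant"


def enumerate_combinations(pairs=PAIRS):
    """Return (rhythm, beat, label) for both role assignments of every pair."""
    combos = []
    for first, second in pairs:
        for rhythm, beat in ((first, second), (second, first)):
            combos.append((rhythm, beat, direction(rhythm, beat)))
    return combos


if __name__ == "__main__":
    # Combinations are printed as rhythm/beat, following the convention in the text.
    for rhythm, beat, label in enumerate_combinations():
        print(f"{rhythm}/{beat}: {label}")
```

Each pair yields one dominant and one non-dominant assignment, which gives the 12 combinations performed in each subtest.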
The order of the subtests was the same for all participants (low, high, medium), but two different orders of the combinations were counterbalanced between the participants (limb combinations are written as rhythm/beat).
To ensure that the tempo of the performances would be comparable, the participants wore headphones with a click track playing a regular beat at 90 bpm (i.e., with a beat periodicity of 667 ms). The volume was adjusted to a comfortable level before the experiment, and the participants were instructed to synchronize their beat limb with the click track.
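The relation between the click-track tempo and the beat periodicity stated above is a simple unit conversion. A minimal Python sketch, with the hypothetical helper name `beat_period_ms`:

```python
def beat_period_ms(bpm):
    """Convert a tempo in beats per minute to the interval between beats in milliseconds."""
    if bpm <= 0:
        raise ValueError("tempo must be positive")
    return 60_000.0 / bpm  # 60,000 ms per minute divided by beats per minute


if __name__ == "__main__":
    # The click track in the study ran at 90 bpm.
    print(round(beat_period_ms(90)))
```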
The participants had two trials per combination and their performance was not evaluated during the experiment. Once the click track started, they were allowed a few measures to adjust to the beat before or between trials. They were also allowed to start tapping the beat a few measures before performing the rhythm. It was emphasized, though, that they were not allowed to practice with both limbs before or between trials.
To measure rhythmic competence, the participants completed the rhythmic part of the Musical Ear Test [29]. The test consists of 52 pairs of short rhythmic phrases, and the participants judged whether the phrases in each pair were identical or not. The test is auditory only and thus gives a measure of the participants' perceptual abilities.

Equipment
All trials were audio recorded in stereo with one microphone for the rhythm limb and one for the beat limb. The microphones were moved according to the active limbs during the trial in question. One microphone was handheld by the experimenter, while the other was mounted on a stand. A TC Electronics Konnekt 8 soundcard and the program Reaper 64 were used for the recordings.

Evaluation
The participants performed each combination twice, and the better of the two trials was selected for evaluation. The recording of each combination was pseudo-anonymized and then evaluated by both an experimenter and an external rater educated at conservatory level.
Each combination was given a grade between 1 and 5 according to the following criteria: 1) the combination was not accomplished, 2) the combination was almost accomplished, but with some mistakes, 3) the combination was accomplished, but somewhat imprecise compared to the metronome, 4) the combination was accomplished, but one or two beats were imprecise compared to the metronome, and 5) the combination was accomplished without mistakes. The final score for each combination was the average of the grades given by the experimenter and the external rater. This will be referred to as the Rhythm Score.
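The scoring rule can be written out as a short sketch. The function name `rhythm_score` and the input validation are assumptions made for illustration; only the 1 to 5 grade range and the averaging of the two raters come from the text.

```python
GRADE_MIN, GRADE_MAX = 1, 5  # grade range described in the evaluation criteria


def rhythm_score(experimenter_grade, external_grade):
    """Average the two raters' grades for one combination."""
    for grade in (experimenter_grade, external_grade):
        if not GRADE_MIN <= grade <= GRADE_MAX:
            raise ValueError(f"grade {grade} is outside the range 1 to 5")
    return (experimenter_grade + external_grade) / 2


if __name__ == "__main__":
    # Hypothetical grades for a single combination, not data from the study.
    print(rhythm_score(4, 5))
```

Because the score is the mean of two integer grades, it moves in steps of 0.5.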

Inter-rater reliability
Inter-rater reliability between the two raters was calculated using an average-measures, absolute-agreement, two-way mixed-effects model. The resulting intra-class correlation (ICC) was within the excellent range, ICC = 0.97 (95% CI [0.953; 0.979], p < 0.001), indicating that the raters had a high degree of agreement. The high ICC suggests that a minimal amount of measurement error was introduced by the two independent raters, and therefore statistical power for subsequent analyses is not substantially reduced. An average of the two raters' evaluations was therefore deemed to be a suitable score for use in the hypothesis tests of the present study.
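For readers unfamiliar with this ICC form, the sketch below computes an average-measures, absolute-agreement ICC from two-way ANOVA mean squares, using the standard formula (MS_rows - MS_error) / (MS_rows + (MS_columns - MS_error) / n). It is an assumption that this matches the software routine used in the study; the function name `icc_a_k` and the example ratings are hypothetical.

```python
def icc_a_k(ratings):
    """Average-measures, absolute-agreement ICC for a table of ratings.

    ratings: list of rows, one row per rated recording, one column per rater.
    The result is undefined (division by zero) if every rating is identical.
    """
    n = len(ratings)        # number of rated recordings
    k = len(ratings[0])     # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with at least 2 rows and 2 raters")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


if __name__ == "__main__":
    # Hypothetical grades from two raters for three recordings, not study data.
    print(icc_a_k([[1, 2], [3, 4], [5, 6]]))
```

In the hypothetical example the second rater is consistently one grade higher; an absolute-agreement ICC penalizes this systematic offset, whereas a consistency ICC would not.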

Statistical approaches
The average Rhythm Score was confounded by a large number of participants, mainly musicians, who were able to complete almost all combinations equally well. Specifically, 22 participants (13 professional musicians and 9 amateur musicians) obtained an average Rhythm Score of 4.9 or above in either the dominant or the non-dominant combinations, or both. Hence, two approaches to the analysis were carried out: one using the Rhythm Score as the outcome variable, but excluding the 22 ceiling performers, leaving 38 for analysis; and one using the binary outcome variable Accuracy, which was derived from the Rhythm Score and indicated whether a combination was accomplished or not. Rhythm Scores of 2.5 and above were deemed accomplished. All participants were included in the analyses of Accuracy.
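The two derived quantities can be sketched as follows. The thresholds (2.5 for an accomplished combination, 4.9 for ceiling performance) are taken from the text; the function names and the example scores are hypothetical.

```python
from statistics import mean

ACCOMPLISHED_THRESHOLD = 2.5  # Rhythm Scores at or above this count as accomplished
CEILING_THRESHOLD = 4.9       # mean score at or above this marks a ceiling performer


def accuracy(score):
    """Binary Accuracy derived from a Rhythm Score: 1 if accomplished, else 0."""
    return 1 if score >= ACCOMPLISHED_THRESHOLD else 0


def is_ceiling_performer(dominant_scores, non_dominant_scores):
    """True if the mean Rhythm Score reaches the ceiling in either direction."""
    return (mean(dominant_scores) >= CEILING_THRESHOLD
            or mean(non_dominant_scores) >= CEILING_THRESHOLD)


if __name__ == "__main__":
    # Hypothetical scores for one participant, not data from the study.
    dominant = [5.0, 5.0, 4.5]
    non_dominant = [3.0, 2.0, 4.0]
    print([accuracy(s) for s in non_dominant])
    print(is_ceiling_performer(dominant, non_dominant))
```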
Initially, Accuracy in the non-dominant and dominant versions was compared for each limb combination using the mid-p version of McNemar's exact conditional test for paired binary observations [31], to make an initial assessment of the hierarchy. Two models were then constructed. The first tested the hypothesis that the voice occupies the top of the hierarchy by fitting a generalized linear mixed-effects model (GLMM) on the binary outcome Accuracy. The combinations included in this model were limited to those involving the voice. The fixed factors included in this analysis were 1) Group (between-subject: non-musician, amateur-musician and musician): increased Accuracy for amateur-musicians and musicians, respectively, is expected as compared to non-musicians, 2) Rhythm Complexity level (within-subject: low, medium and high): increased Rhythm Complexity is expected to affect Accuracy negatively, and 3) Direction (within-subject: dominant or non-dominant): increased Accuracy for the dominant combinations is expected and would confirm the hypothesized hierarchy. Random intercepts for participants were included as well as by-participant random slopes for the effect of Direction, which accounted for inter-individual differences in the effect of working against the hierarchy. The second model tested the hypothesis that the effect of Direction depended on the expertise level of the participants by constructing a linear mixed-effects model (LMM) that also included non-voice combinations, and fitting it on the average Rhythm Score of the remaining 38 participants. Due to the vast reduction of the groups of musicians and amateur-musicians, the Group factor was replaced by the MET-score, which served as an index of musical expertise.
Accordingly, the fixed factors in this analysis were 1) MET-score (between-subject): an increased MET-score is expected to affect Rhythm Score positively, 2) Rhythm Complexity level (within-subject: low, medium and high): an increased level of Rhythm Complexity is expected to affect the average Rhythm Score negatively, and 3) Direction (within-subject: dominant or non-dominant): an increased Rhythm Score for the dominant combinations is expected and would again confirm the hypothesized hierarchy.
The random effects were similar to those of the GLMM described above. The possibility of an interaction effect between MET-score and Direction was of particular interest in this model. All analyses were conducted using the lme4 package [32].
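The mid-p McNemar test used for the initial comparisons depends only on the discordant pairs. The sketch below uses one common formulation, in which the two-sided mid-p value is twice the lower binomial tail at the smaller discordant count minus the point probability at that count. It is an assumption that this matches the implementation used in the study; the function name `mcnemar_midp` and the example counts are hypothetical.

```python
from math import comb


def mcnemar_midp(b, c):
    """Two-sided mid-p value for McNemar's test.

    b: pairs accomplished only in the dominant version
    c: pairs accomplished only in the non-dominant version
    Under the null hypothesis, min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    point = comb(n, k) * 0.5 ** n                                 # P(X = k)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) * 0.5 ** n  # P(X <= k)
    return min(1.0, 2 * lower_tail - point)


if __name__ == "__main__":
    # Hypothetical discordant counts, not data from the study.
    print(mcnemar_midp(9, 1))
```

Subtracting the point probability makes the mid-p version less conservative than the exact conditional test, which is why it is often preferred for small numbers of discordant pairs.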

Results
The differences between Accuracy in the dominant and non-dominant versions of each limb combination were highly significant in all but the RH+LH condition; here, only the high complexity rhythmic pattern yielded a significant difference (table 1), while the others only showed a tendency. Accuracy was also affected significantly by both Complexity, χ²(2) = 122.86, p < 0.001, and Group, χ²(2) = 62.01, p < 0.001, as expected. Planned contrasts in the Group variable showed that compared to the amateur-musicians, the log odds of completing were significantly lower for non-musicians (b = -6.59, z = -7.12, p < 0.001), and higher, but not significantly, for musicians (b = 0.63, z = 0.74, p = 0.457). Planned contrasts in the Complexity variable showed that compared to rhythm C, the log odds of completing were significantly lower for rhythm B (b = -1.27, z = -4.05, p < 0.001), and significantly higher for rhythm A (b = 2.68, z = 6.12, p < 0.001). See figure 3.
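The contrast coefficients above are on the log-odds scale. Exponentiating a coefficient gives the corresponding odds ratio relative to the reference level, which may be easier to interpret. The sketch below is illustrative only; the coefficient values are those reported in the text, while the dictionary layout and the function name `odds_ratio` are hypothetical.

```python
from math import exp


def odds_ratio(log_odds_coefficient):
    """Convert a logistic-model coefficient (log odds) to an odds ratio."""
    return exp(log_odds_coefficient)


# Coefficients reported in the text, each relative to its reference level.
REPORTED_CONTRASTS = {
    "non-musicians vs amateur-musicians": -6.59,
    "musicians vs amateur-musicians": 0.63,
    "rhythm B vs rhythm C": -1.27,
    "rhythm A vs rhythm C": 2.68,
}

if __name__ == "__main__":
    for label, b in REPORTED_CONTRASTS.items():
        print(f"{label}: b = {b:+.2f}, odds ratio = {odds_ratio(b):.4f}")
```

An odds ratio below 1 means lower odds of completing a combination than in the reference level, and an odds ratio above 1 means higher odds.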
A linear mixed-effects model was fitted on Average Score with MET, Direction and Rhythm as fixed factors. Random effects were the same as in the GLMM.
Moreover, there was a significant interaction effect of the MET-score and the Direction, χ²(1) = 7.2939, p = 0.007. This result should be taken with caution, however, since the effect was most likely driven by a ceiling effect in the low complexity subtask (figure 4).

Control experiments
Two control experiments were performed after the original data collection. Control experiment 1 used 18 musicians from the original sample to confirm the hierarchical organization of hands and feet by self-assessment and to control for the possibility of the hierarchy only pertaining to the three rhythms used in this study. The musicians' task was to keep the beat with one limb and improvise rhythmically with another, in dominant and non-dominant versions of the combinations V+RH, RH+LH, LH+RF and RF+LF.
The order of the combinations was counterbalanced between subjects. After each combination pair, participants were asked if they found the dominant or non-dominant version easiest.
Control experiment 2 used 19 musicians from a different sample in connection with a separate study to control for the possibility of the hierarchy being caused by dexterity differences between limbs. The participants performed a rhythmic pattern and its corresponding beat in both directions of the combination pairs Clap+RF, V+Clap and RH+LH. This rhythmic pattern had longer inter-onset intervals (IOIs) than the corresponding beat (figure 5). After each trial, the musicians were asked to rate how easy they found the task on a scale from 0 to 100. The order of the combinations was counterbalanced between subjects.

Control experiment 1
Results convincingly confirm the proposed hierarchy and are seen in figure 6. Out of 18 musicians, the dominant combination was preferred by 18 in V+RH, 16 in RH+LH, 18 in LH+RF and 17 in RF+LF.

Discussion
Based on these results, we suggest a comprehensive bodily hierarchy governing interlimb coordination during simultaneous performance of rhythm and beat with two different limbs. From lower to higher levels, the hierarchy incorporates 1) left foot, 2) right foot, 3) left hand, 4) right hand, 5) voice. Participants' execution of the musical rhythm/beat tasks was more precise when the supporting role of keeping the beat was undertaken by a limb occupying a lower level in the hierarchy than the limb performing the rhythm. In combinations of hands and feet, we observed better performance when beat-keeping was assigned to a foot while the rhythm was performed using the hands. Yet, when the supporting role of beat-keeper was assigned to the voice, it was strikingly difficult for the participants to perform the rhythm with the hands, as indicated by significantly better performance in the opposite combination. In bimanual executions of the task, we observed better performance when beat-keeping involved the left rather than the right hand. This difference, however, was only significant for the high complexity rhythmic pattern.
While performance generally increased with musical expertise, this hierarchical pattern was consistent across groups of non-musicians, amateur musicians, and professional musicians. A subsequent control experiment in which the rhythm pattern had longer IOIs than the corresponding beat pattern ruled out the possibility that dexterity accounts for the results and suggested a precedence of the vertical over the horizontal axis. A follow-up experiment in which participants improvised with one limb while keeping a steady beat with another confirmed the bodily hierarchy and extended the results to improvisational behavior. Taken together, our results suggest a comprehensive bodily hierarchy of interlimb coordination with a vertical axis preceding a horizontal axis as illustrated in figure 7.
By using musical rhythm, we were able to create coordination tasks including two separate actions (performing rhythm and keeping the beat) that could be performed on equal terms by hands, feet and voice, thereby allowing the roles of the limbs to interchange. Previous studies on verbal/manual coordination have used either fundamentally different tasks for voice and hands, for instance speaking and tapping [26][33][34], or exactly similar tasks, i.e., identical rhythmic patterns with the body and voice [27][35]. These studies found dual-task interference in the former case and mutual stabilization of the two body parts in the latter.
According to Kinsbourne and Hicks [24], performing two different tasks simultaneously without losing efficiency on the main task requires a high degree of automatization in one of the tasks. Our study, which is based on the musicological dichotomy between rhythm and meter/beat, revealed that this assumption depends on which limb performs which action. The possible stabilizing effect [35] only appears if the supporting role, i.e., the beat, is maintained by a lower level of the bodily hierarchy while a higher level maintains the conducting role, i.e., the rhythm. Even for musicians, for whom beat-keeping is highly automatized, the task of performing a rhythmic pattern with one hand was complicated by having the voice execute the task of keeping a beat. This happened despite the fact that the vast majority of the musicians were perfectly able to perform the opposite dominant combinations.
In the present study, musicians outperformed non-musicians, and Rhythm Scores generally increased with performance on rhythm perception tasks (the Musical Ear Test). Importantly, the difficulties for the low-scoring participants of our study did not arise during simple tapping or vocalization of the rhythm in isolation, as all participants were able to perform the rhythm and beat separately in the training session. Rather, the challenges arose when combining the two components. Previous studies [39] have shown that when learning to tap 3:2 polyrhythms, integrated training is more effective than training the hands separately, indicating that learning to tap a bimanual polyrhythm requires that the participants view the left- and right-hand rhythms as one single action as opposed to combining two separate actions. Consistent with this, we may speculate that the musicians in our study excel at coordinating rhythm and beat because they perceive it as one unified action.
Previous studies, inspired by influential theories of brain function known as predictive coding [40], which posit prediction as the fundamental principle behind brain function, have indicated that a hallmark of musical expertise is musicians' ability to form more precise musical predictions [41]. According to predictive coding of music, the rhythm is the acoustical input to our ears, whereas the meter is the brain's posterior expectations that constitute its predictive model. The rhythm can be more or less conflicting with the meter, creating stronger or weaker prediction error between auditory input and predictive model [42]. Studies on musical groove have shown that high rhythmic expertise corresponds to a strong predictive model (the meter), which results in less destabilization when synchronizing one's body to the beat of a syncopated groove [43]. Hence, the more precise predictive model makes it easier for musicians than for non-musicians to attenuate prediction errors [44]. This ability is, however, still less pronounced in the non-dominant combinations. Control experiment 2 was inconsistent with dexterity as an explanation of the vertical part of the bodily hierarchy, since the musicians in this experiment, even with a rhythmic pattern with longer IOIs than the beat, still preferred to keep the beat with a limb lower in the hierarchy than the limb performing the rhythm. Instead, the bodily hierarchy seems to be an abstract organization of conducting and supporting roles, conceptualized in music as the tension-creating dichotomy between rhythm and meter, with precision-weighted prediction error shifting dynamically according to where in the body it takes place.
Our data show indisputable evidence of the voice occupying the highest level of the bodily hierarchy. Previous studies have demonstrated shared cognitive resources between rhythm and language: syntactic processes residing in the left hemisphere [45][46][47], rhythm and grammar learning [48], the effect of rhythm training on recovering from aphasia [49] and a possible rhythm perception deficit in children who stutter [50].
Earlier studies [23] have speculated that right-hand dominance in bimanual coordination may be linked to a shared left-hemispheric specialization for speech and right hand. The link between rhythm and language is manifested in several common pedagogical practices, where rhythms are taught and practiced through phonetic vocalization before performance is transferred to a musical instrument, such as in the South Indian musical style 'Konnakol'. Here, a rhythm 'language' based on vocal imitations of drum sounds functions both as an art form in itself and as a deeply integrated part of the training required to play the Mridangam drum [51]. Such practices are built on an experience of rhythms being more easily learned when vocalized. This is in accordance with the superior task performance when participants vocalize rhythms rather than the beat observed in the present study.
The rhythms of the three subtests were sometimes too easy for the musicians and too difficult for the non-musicians, a well-known challenge in studies on rhythm performance. In some cases, we observed a ceiling effect in the groups of musicians and amateur-musicians and almost a floor effect in the group of non-musicians. However, the sample size allowed us to exclude the ceiling effect cases and still obtain a valid result with the linear mixed-effects model by omitting the group factor in favor of using the MET-score as an expertise measure. By using MET-scores in addition to the categorical classification of musicianship, we obtained a nuanced view of how musical expertise influences the coordination of rhythm and meter on a group level as well as on an individual level. However, an adaptive design would be advantageous in future studies, where the Complexity level of each rhythm would depend on the performance of the previous rhythm. An adaptive paradigm may also reveal a difference between musicians and amateur-musicians, which did not appear in the present study, where the best musicians' true potential could not be realized.
Overall, this study suggests a comprehensive bodily hierarchy with the voice occupying the highest level, showing how interlimb coordination is governed by a hierarchical organization of limbs, reflecting an abstract universal organization of conducting and supporting roles.

Figure 3
The percentage of completed combinations, split by subtest and expertise group, showing only combinations including the voice. Vertical bars represent 95% confidence intervals.

Figure 4
Rhythm Score as a function of MET-score. The chart is divided by subtest and direction of combinations. 95% confidence intervals are shown.

Figure 5
Rhythmic pattern used in control experiment 2.

Figure 6
Distribution of preferred combinations among musicians. Out of 18 musicians, the dominant combination was preferred by 18 in V+RH, 16 in RH+LH, 18 in LH+RF and 17 in RF+LF.