The present study comprised four stages (Fig. 1). The first stage was the design of a new FMS tool, the FUS test, by a panel of researchers. In the second stage, two pilot studies were conducted to examine the feasibility and acceptability of the proposed FUS assessment. The third stage involved the development of the FUS test. The last stage focused on examining the validity and reliability of the FUS test. In accordance with the COSMIN taxonomy of measurement properties, content validity and several reliability measures (inter-rater, intra-rater, and test-retest reliability, and internal consistency) were assessed [42]. The study design and testing protocol were approved by an institutional Research Ethics Committee.
Stage 1. Design of the FUS test
The initial step was to establish a research team of 18 members with academic experience in sport, PE, and physical activity. Seven of the researchers had experience with motor development or motor learning research; the others were sport and physical activity researchers or sport coaches. The research team used an evidence-based practice approach to design the FUS test. First, they defined the primary purpose of the new assessment, which is to evaluate FMS proficiency in children and adolescents. Then, the research group reviewed the literature to find and evaluate the available evidence regarding FMS assessment and to identify strengths and barriers in existing assessments. Finally, the research team discussed how best to incorporate evidence into practice [43]. The best available evidence involved: (i) research, including original studies and systematic reviews; (ii) the accumulated knowledge and experience of the expert team along with potential assessment users (PE teachers, n = 32); and (iii) interviews with 9–14-year-old children (n = 75) about their preferred sports.
The research team identified 17 skills (e.g. running, jumping, catching, galloping, marching, climbing, swimming) and 20 motor tasks (e.g. jumping rope, jumping onto and off a box, controlling a ball in a slalom dribble, cycling, throwing and catching a frisbee) common in sport and physical activity that represented the range of FMS proficiency of school-aged children (age range = 7–14 years). Subsequently, the seven researchers with experience in motor development or motor learning research ranked these skills and tasks according to how well they matched the definition of FMS (i.e., early movement behaviors that establish the foundation for later movement experiences [4]). From the resulting shortened list of 9 skills and tasks, PE teachers then selected the set of skills and tasks most relevant to the goals of PE. The researchers and PE teachers also ranked the most important considerations associated with assessing FMS in school settings. The final six FMS were selected by considering information collected in a survey of the researchers, teachers, and students, together with three meetings of a discussion group composed of members of the research team. The final selection of skills for the FUS test was guided by the following criteria: (i) the degree of sports utility for promoting engagement in a broad range of physical activities, (ii) the fundamental nature of the skills as a foundation for acquiring advanced skills, (iii) the comprehensiveness of the assessment, (iv) the ability to identify the components that are most important for skill mastery, and (v) ease of skill evaluation in an applied setting.
The attributes considered in selecting the tasks included in the FUS test were: (i) the potential for an accurate skill assessment, (ii) a balanced trade-off between task complexity and simplicity, (iii) the feasibility of conducting the assessment in a school setting, (iv) the ability to assess skills under dynamic conditions, (v) the attractiveness of the task, (vi) the ability to assess several skills together while performing the task, and (vii) the ability to modify the task as needed.
Stage 2. Pilot studies
After the initial development of the FUS test, two pilot studies were conducted to examine the feasibility and acceptability of the proposed test format. For the first pilot study, a total of 127 children and 13 members of the research team participated after approval from the school headmaster, teachers, and parents. Students were recruited by convenience sampling from a primary school in a typical Polish town (Biała Podlaska). The sample included students from all primary school grades in Poland (grades 1 to 8), aged 7–14 years, and comprised 61 girls and 66 boys. Class sizes in each grade ranged from 12 to 17 students. One week later, 13 members of the research team conducted a second pilot study, which examined 142 children at a different school in the same town. Again, the sample covered all primary school grades and comprised 66 girls and 76 boys. Class sizes ranged from 15 to 20 students.
The aim of the first pilot study was to assess the children's ability to perform each task within the FUS and to verify whether the research team could sufficiently evaluate all skill components. The research team then met, discussed concerns and differences in their evaluations, and made minor changes related to the testing environment, equipment, and task execution. Based on the results of the first pilot study, it was determined that four of the FUS tasks needed to be modified in order to minimize the influence of body build and physical abilities on FMS outcomes. For example, the modifications involved adjusting the height of the hurdles so they were age appropriate, changing the size of the balls, adjusting the distance and size of targets, and providing simplified rules for younger participants. The aim of the second pilot study was to validate the changes made to the assessment protocol following the first pilot study. These changes proved valid, and no new modifications were made after the second pilot study. Based on the results of the two pilot studies, the research team finalized the FUS test, which is described in the following section.
Stage 3. The development of the FUS test
Following the completion of the pilot studies the research team determined that the FUS test would include the assessment of six FMS: running over hurdles, jumping rope, forward roll, bouncing a ball, overhand ball throwing and catching, and kicking and stopping a ball. For each activity, the administrator assesses the level of mastery by evaluating key performance components of each task. Each task is assessed by 5 criteria which have been organized in a mixed process-oriented and product-oriented structure.
Hurdles
The task is to run over three hurdles (obstacles) in the 30 m run as fast as possible. The criteria in this task are as follows: criterion 1. the run-up to the first hurdle is fast, knees are lifted high and elbows are bent; criterion 2. there is no slow down prior to hurdle clearance, and there is clear forward movement during the take-off that precedes hurdle clearance; criterion 3. body moves flat over the hurdle, the trunk leans forward, the trail leg moves quickly forward (without stopping); criterion 4. stride pattern between the hurdles is rhythmic, the number of strides between particular hurdles is the same; criterion 5. there is no slow down after hurdle clearance, balance is maintained on landing and the run is continued in a straight line.
Jumping rope
The task is to perform rhythmic and continuous jumps over the rope for 10 seconds. The following criteria apply to this task: criterion 1. jumps are performed continuously (without stopping); criterion 2. jumps are rhythmic and single, with short ground contact time and landing on the balls of the feet; criterion 3. arms are bent and held close to the trunk, and the rope is moved using the rotation of forearms and wrists; criterion 4. knees and hips are slightly bent during flight and landing; criterion 5. jumps are performed vertically, with each jump starting in the same designated area, the trunk upright, and the feet parallel and hip-width apart.
Forward roll
The task is to perform a forward roll starting and ending in a squat position with hands on the ground. The following criteria are considered for this task: criterion 1. the task is started in a squat position with both hands placed on the mat and the chin tucked into the chest; both legs are extended equally to push off the ground; criterion 2. rolling over the back is performed without stopping and with the chin tucked; criterion 3. symmetry of movement is maintained while rolling, legs are bent and tucked to the chest; criterion 4. forward roll is performed in a straight line; criterion 5. the task is completed in a squat position with hands placed on the ground in front of the toes.
Ball bouncing
The task is to bounce the ball while walking for 10 m and running an additional 10 m, for a total distance of 20 m. In this task, the criteria are: criterion 1. in the first 10 m of the test the ball is rhythmically bounced at hip height with the top of the ball remaining below the chest while walking in a straight line; criterion 2. the second 10 m of the test is covered running and bouncing the ball with the ball remaining relatively close to the body; criterion 3. the whole distance (20 m) is covered bouncing the ball in front of and slightly to the side of the body. The ball is not carried at any point during the test; criterion 4. the elbow and wrist are extended when the ball is pushed toward the ground. The ball is controlled with the tips of the fingers; criterion 5. the trunk is upright while the ball is bounced (students aged 7–9) or eyes are focused forward while the ball is bounced (students aged 10–14).
Throwing and catching
The task is to perform a one-handed overhead throw with a run-up, hit the targeted area of the wall with the ball, and then catch the ball with one or both hands after it bounces against the wall. The task must meet the following criteria: criterion 1. the run-up is performed continuously without crossing the line marked on the floor; criterion 2. the throw is initiated with the throwing arm brought back and the foot of the opposite leg clearly in front of the body; afterward, the overhead throw is performed; criterion 3. the ball hits the wall above the line (in the target area); criterion 4. the ball is caught, and hands do not touch the chest; criterion 5. the student remains behind the designated line when catching the ball.
Kicking and stopping a ball
The task is to kick the ball so that it hits the target area marked on the wall, and then to stop the returning ball with the foot. The criteria in this task are as follows: criterion 1. the run-up is performed continuously, and the line marked on the floor is not crossed following the kick; criterion 2. the kicking leg is bent at the knee during the backswing for the kick, the non-kicking foot is placed beside the ball; criterion 3. the ball is kicked with the instep, top, or the side of the foot; criterion 4. the ball hits the target area marked on the wall, returns immediately to the student, and crosses the line of the designated area marked on the floor; criterion 5. after hitting the target area, the ball is stopped with one foot in the designated area.
In terms of scoring, the participant is awarded “1” point for each criterion met and “0” points for each criterion not met. Points are given only when a criterion is clearly satisfied. Two attempts are performed for each prescribed task, and the trial with the higher score is used for further analysis. Performances are video recorded, and scoring is completed by an assessor through analysis of the recordings. Alternatively, live scoring could be conducted immediately following each trial. If live scoring is conducted, it should be done by individuals who are experienced in assessing the components of the skills evaluated in the FUS test. It is recommended that the assessor complete 8–10 hours of training in the use of the FUS test before scoring live at the time of the assessment.
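As an illustration only, the scoring rule described above can be sketched in Python. The function names `trial_score` and `task_score` are hypothetical and do not come from the FUS manual; the sketch assumes that each trial is recorded as five true/false judgements, one per criterion.

```python
def trial_score(criteria_met):
    """Score one trial: 1 point per criterion clearly met, 0 otherwise."""
    if len(criteria_met) != 5:
        raise ValueError("each FUS task is assessed against five criteria")
    return sum(1 for met in criteria_met if met)


def task_score(first_trial, second_trial):
    """Return the higher of the two trial scores, which is used for analysis."""
    return max(trial_score(first_trial), trial_score(second_trial))


# Illustrative judgements for two attempts at one task (not study data).
first = [True, True, False, True, False]   # 3 criteria met
second = [True, True, True, True, False]   # 4 criteria met
print(task_score(first, second))
```

Under these assumptions, a task score always falls between 0 and 5.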
Similar to previous research [28, 44], four levels of mastery for each skill were established: ‘full mastery’, ‘near mastery’, ‘some mastery’, and ‘poor’. ‘Full mastery’ is achieved when all skill components are successfully performed (scored 5 points). ‘Near mastery’ is obtained when all but one component are performed correctly (scored 4 points). ‘Some mastery’ is accomplished when three components are executed correctly (scored 3 points). If two or fewer components are properly executed, the level is considered ‘poor’.
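The mapping from a task score to a mastery level can be written as a short lookup. The following is a minimal sketch; the function name `mastery_level` is hypothetical, and the score is assumed to be an integer from 0 to 5.

```python
def mastery_level(score):
    """Map a task score (0-5 criteria met) to one of four mastery levels."""
    if not 0 <= score <= 5:
        raise ValueError("a task score must lie between 0 and 5")
    if score == 5:
        return "full mastery"
    if score == 4:
        return "near mastery"
    if score == 3:
        return "some mastery"
    return "poor"  # two or fewer components executed correctly


for points in range(6):
    print(points, mastery_level(points))
```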
Subsequently, the scores for all six FUS skills provide the basis for evaluating overall FMS proficiency at four levels. ‘Excellent FMS proficiency’ is obtained when the student fully masters all six assessed FUS skills (5 points for each skill) or fully masters all but one skill, which is at ‘near mastery’ (4 points). ‘Good FMS proficiency’ is reached when the student is at least at ‘near mastery’ for each FUS skill (at least 4 points) but does not meet the requirements for ‘excellent FMS proficiency’. ‘Elementary FMS proficiency’ is reached when the student is at least at the ‘some mastery’ level for each assessed skill (at least 3 points) but does not meet the requirements for the ‘excellent FMS proficiency’ or ‘good FMS proficiency’ levels. The fourth level, ‘insufficient FMS proficiency’, is assigned when skill performance does not meet the requirements for any of the other three levels.
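Because the four overall levels are defined by exclusion, it may help to see them as an ordered set of checks. The sketch below is one possible reading of the rules in this paragraph; the function name `overall_proficiency` is hypothetical, and the input is assumed to be the six task scores (0 to 5 each).

```python
def overall_proficiency(skill_scores):
    """Classify overall FMS proficiency from the six FUS task scores."""
    if len(skill_scores) != 6:
        raise ValueError("the FUS test comprises six skills")
    if any(not 0 <= s <= 5 for s in skill_scores):
        raise ValueError("each task score must lie between 0 and 5")

    lowest = min(skill_scores)
    fully_mastered = sum(1 for s in skill_scores if s == 5)

    # Excellent: all six skills at 5, or five at 5 and the remaining one at 4.
    if lowest >= 4 and fully_mastered >= 5:
        return "excellent"
    # Good: every skill at least 'near mastery' (4 points).
    if lowest >= 4:
        return "good"
    # Elementary: every skill at least 'some mastery' (3 points).
    if lowest >= 3:
        return "elementary"
    return "insufficient"


print(overall_proficiency([5, 5, 5, 5, 5, 4]))  # excellent
print(overall_proficiency([5, 5, 4, 5, 4, 5]))  # good
```

Note that, under this reading, a single skill scored at 2 points or lower places the student at the insufficient level regardless of the other five scores.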
Prior to testing each skill, students are briefly told why the skill is important, how to perform the task, and which skill components will be evaluated. Subsequently, students are given verbal instructions and a visual demonstration of the whole task by a trained administrator. Participants are provided standardized instructions designed to direct attention externally. According to the constrained action hypothesis, an external focus of attention supports motor learning and performance by improving movement automaticity, resulting in better performance than instructions that direct attention internally or neutrally. Studies have shown that using an external focus of attention is beneficial for throwing [45], catching [46] and jumping [47] in children, including children with Developmental Coordination Disorder [48]. It is worth noting that the instructions that support performance of the whole task also directly address at least one criterion in each task. For example, the instruction for the forward roll in the FUS promotes the adoption of an external focus of attention by instructing the participant to “perform a forward roll along the line.” All participants perform one familiarization (i.e., practice) trial for each task followed by two formal trials. No verbal feedback on performance is given during or following each trial. All trials are recorded using a video camera or smartphone. The recording method, distance and camera angles are specified for each task to ensure consistency in data collection. For more information on the FUS testing procedure, please refer to the test instructions provided in the manual for teachers “Test of Fundamental Motor Skills in Sport” (Supplement 1).
Stage 4. Evaluating the validity and reliability of the FUS test
This stage of the study included 264 school-aged students in grades 1–3 (7–9 yrs; n = 81), 4–6 (10–12 yrs; n = 89) and 7–8 (13–14 yrs; n = 94), comprising 139 girls and 125 boys. The students came from six public schools that were randomly selected, with stratification by place of residence (2 rural, 2 suburban and 2 urban schools), from a list of schools that participated in a nationwide project promoting extra-curricular sport activities. The project involved more than 100,000 school-aged children and 6,600 PE and early primary school teachers.
To establish content validity, six advisors, all with a research background in motor learning or development, out of the 12 invited to participate in the study completed an online questionnaire. Using a four-point Likert scale (1 = the item is not relevant to the measured domain; 2 = the item is somewhat relevant to the measured domain; 3 = the item is quite relevant to the measured domain; 4 = the item is highly relevant to the measured domain [49]), advisors rated the essential components for the selected FUS items. Means greater than 3.0 for each item were considered acceptable.
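The acceptance rule for the mean expert rating can be illustrated with a short sketch. The function name `item_is_acceptable` is hypothetical, and the ratings shown are invented for illustration and are not the advisors' actual responses.

```python
from statistics import mean


def item_is_acceptable(ratings, threshold=3.0):
    """Return True when the mean expert rating of an item exceeds the threshold."""
    if any(r not in (1, 2, 3, 4) for r in ratings):
        raise ValueError("ratings must use the four-point scale (1-4)")
    return mean(ratings) > threshold


# Six illustrative ratings for one item (mean = 3.5).
print(item_is_acceptable([4, 4, 3, 3, 4, 3]))
```

Because the rule requires a mean strictly greater than 3.0, an item rated 3 by every advisor would not pass under this reading.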
Before reliability testing, the research team underwent specific training related to the assessment. The 12 members of the research team were divided into six pairs, and each pair was trained to evaluate one motor task. Each pair of evaluators consisted of content experts in the skill they were tasked to evaluate. Specifically, hurdling was evaluated by experts in track and field, jumping rope by experts in combat and strength sports, the forward roll by experts in gymnastics, bouncing the ball by experts in basketball, throwing and catching the ball by experts in volleyball and handball, and kicking and stopping a ball by experts in football (soccer). Subsequently, using recordings from the pilot studies, each pair practiced, individually and together, evaluating their assigned task according to the FUS protocol. This process required approximately 15 hours of their time.
Data collection occurred in May and June 2022 during regular PE classes. Six four-person teams (each comprising 2 research team members and 2 postgraduates) administered and recorded the FUS test in the six participating schools. At the beginning of the lesson, participating children were divided into four groups of 3–6 students. All participants were assigned numerical codes to maintain anonymity and facilitate later video analysis. Two members of the research group demonstrated, administered, and recorded two tasks: throwing and catching a ball along with kicking and stopping a ball, or jumping rope along with the forward roll. The remaining two tasks (ball bouncing and hurdles) were demonstrated, administered and filmed by one researcher. Each class session took 45–50 minutes, including the introduction and warm-up. Between 14 and 22 students from each PE class participated in a measurement session. Students performed two trials without any feedback after the familiarization trial, but general positive encouragement was given to all participants at the conclusion of each task. Testing sessions occurred indoors and outdoors. Each trial was videotaped and then evaluated by a pair of trained researchers. All tasks were recorded using one tripod-mounted video camera (Lamax W 9.1, Poland). The MP4 video format and 1920 × 1080 resolution were used in all recordings.
Data from all 264 students were used to calculate descriptive statistics for FMS proficiency and to determine internal consistency (Supplement 2). Inter-rater reliability was assessed by examining how consistent the scores of the two assessors in each pair were when they scored the same observed attempts. Each pair of assessors evaluated the performance of 212 students on a task (Supplement 3). The order in which the video footage was reviewed was randomized between examiners. To examine intra-rater reliability, researchers investigated the consistency of scores for 130 video recordings of student performances when they were reassessed after a four-week interval (Supplement 4). Test-retest reliability was examined in two schools, involving 28 students. In both schools, only one researcher conducted the measurements during PE classes, and there was one week between the two assessments (Supplement 5).
Statistical analyses
Descriptive statistics were reported as means and standard deviations. Content validity of the FUS test was evaluated by comparing the assessments of the six advisors using the content validity index (CVI), consistent with the protocol described by Polit et al. [50]. Specifically, the CVI was calculated by dividing the number of experts who gave a rating of ‘3’ or ‘4’ by the total number of experts. A CVI greater than or equal to 0.83 was considered acceptable. Inter-rater and intra-rater reliability were assessed using Cohen’s kappa statistics and intraclass correlation coefficients (ICC) based on a two-way mixed-effects model with single measures and absolute agreement. Cohen’s kappa coefficients were interpreted according to the classification proposed by Landis and Koch [51], and the percentage of observed agreement was calculated. Additionally, 95% confidence intervals (CI) were calculated for both reliability measures: Cohen’s kappa and ICC. Pearson correlation was used to test relationships between variables, and the internal consistency of the FUS test was assessed using Cronbach’s alpha. Limits of agreement (LOA), defined as the mean difference ± 1.96 × SD of the differences, were calculated. A one-sample t-test was used to check for systematic bias, while Pearson correlation was calculated to estimate proportional bias. Test-retest reliability was additionally assessed using ICCs based on a two-way mixed-effects model with a single rater and absolute agreement. For significance testing, the level of statistical significance was set at alpha = 0.05. Data were analyzed using SPSS 27 for Windows (SPSS Inc., Chicago, USA).
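For readers who wish to see how several of these statistics are defined, the following Python sketch implements textbook versions of the CVI, unweighted Cohen's kappa, the limits of agreement, and Cronbach's alpha. It is not the SPSS procedure used in the study; all function names are hypothetical, the sample variance (n − 1 denominator) is an assumption, and the ICC models and confidence intervals are not covered.

```python
from collections import Counter
from statistics import mean, stdev, variance


def item_cvi(ratings):
    """Proportion of experts who rated the item 3 or 4 on the four-point scale."""
    return sum(1 for r in ratings if r >= 3) / len(ratings)


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same attempts."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the product of the raters' marginal frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use one category")
    return (observed - expected) / (1 - expected)


def limits_of_agreement(test, retest):
    """Mean difference -/+ 1.96 x SD of the paired differences."""
    diffs = [b - a for a, b in zip(test, retest)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias - spread, bias + spread


def cronbach_alpha(scores_by_student):
    """Cronbach's alpha; rows are students, columns are task scores."""
    k = len(scores_by_student[0])
    item_variances = [variance(column) for column in zip(*scores_by_student)]
    total_variance = variance([sum(row) for row in scores_by_student])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Illustrative values only (not study data).
print(item_cvi([4, 4, 3, 3, 4, 2]))                # five of six experts
print(cohens_kappa([1, 1, 0, 0], [1, 1, 0, 1]))    # 0.5
```

With six experts, a CVI of 0.83 corresponds to five of the six rating an item ‘3’ or ‘4’, as in the first example.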