Inter- and Intra-observer Reliability of the “Assessment of Motor Repertoire - 3 to 5 Months” Based on Video Recordings of Infants With PWS


Background: The “Assessment of Motor Repertoire - 3 to 5 Months”, which is part of Prechtl's General Movement Assessment (GMA), has been gradually applied to infants with genetic metabolic disorders. However, there have been no studies on the application of GMA to infants with Prader-Willi syndrome (PWS).
Aims: The purpose of this study was to determine the inter- and intra-observer reliability of the assessment tool in the PWS population.
Study design: Reliability and agreement study.
Subjects: This was a cross-sectional study of 15 infants with PWS born at an average gestational age of 38 weeks.
Outcome measures: Standardized video recordings of 15 infants with PWS (corrected ages 3 to 5 months) were independently assessed by three observers. Kappa and ICC statistics were applied in the inter- and intra-observer reliability analysis.
Results: For inter-observer reliability, the overall ICC values of the “Motor Optimality Score” (MOS) ranged from 0.84 to 0.98, and the ICC values for pairwise agreement ranged between 0.86 and 0.95. For intra-observer reliability, the ICC values for the MOS ranged between 0.95 and 0.98 for the respective observers. Complete agreement (100%) was achieved in the subcategories “Fidgety Movements” and “Movement Character” for both inter- and intra-observer reliability. Moderate to high inter- and intra-observer reliability was found in the subcategories “Repertoire of Co-Existent Other Movements”, “Quality of Other Movements” and “Posture”, with kappa values ranging between 0.63 and 1.00.
Conclusion: There were high levels of inter- and intra-observer agreement in the “Assessment of Motor Repertoire - 3 to 5 Months” for infants with PWS. It will be possible to carry out standardized quantitative assessment of the motor performance of infants with PWS.


Introduction
Prader-Willi syndrome (PWS) is a disorder characterized by a genetic imprinting defect associated with chromosome 15q11-13 [1]. In western countries, the estimated prevalence of PWS in different populations ranges from 1/10000 to 1/30000, while there is a lack of epidemiological data in China [2][3][4][5]. The phenotype of PWS gradually emerges with development. The main characteristics of PWS are a variety of physical, cognitive, and behavioral defects [6]. The most marked features are hypotonia in infancy, hypogonadism, short stature, obesity, developmental delay, and intellectual impairments [1]. The occurrence of symptoms is mostly related to age; not all clinical phenotypes are expressed in all patients, and the severity of disability varies among patients [6]. PWS is also one of the important causes of symptomatic morbid obesity, and early diagnosis and reasonable intervention are crucial to improve the quality of life of children, prevent serious complications and prolong life [2][3][4][5][7][8].
Therefore, it is of great clinical significance to identify abnormalities in motor performance in infants with PWS as early as possible, even before the genetic diagnosis has been confirmed, in order to provide timely early rehabilitation intervention.
Prechtl's General Movements Assessment (GMA) is a safe, valid, reliable and noninvasive assessment tool for identifying infants at risk of poor neurodevelopmental outcomes, particularly cerebral palsy, with the aim of intervening early and improving outcomes [9][10]. Initially, the assessment of general movements (GMs) was mainly used to evaluate infants with brain injury. Recently, it has been gradually applied to infants with certain genetic and metabolic diseases [10][11][12][13][14][15].
GMs are divided into two stages according to the developmental process: the writhing movements stage (writhing stage) and the fidgety movements stage (fidgety stage). GMs assessment includes an overall assessment and a detailed assessment. The motor optimality score (MOS) of the "Assessment of Motor Repertoire - 3 to 5 Months", which belongs to the detailed assessment during the fidgety movements stage, can be used not only to quantify the efficacy of early rehabilitation interventions but also to quantitatively analyze the relation between an infant's motor repertoire data and the data obtained from follow-up studies [15][16][17][18].
A previous reliability study of the "Assessment of Motor Repertoire - 3 to 5 Months" showed that inter-observer reliability in the assessment of the total "Motor Optimality Score" for infants with risk factors was very high: the intraclass correlation coefficient (ICC) was 0.87, and ICCs for the pairwise analyses ranged between 0.80 and 0.94 [19]. Building on the results of that reliability study of the MOS scale, Dafne Herrero and colleagues explored the relationship between the MOS of the motor repertoire at 3 to 5 months in infants with Down syndrome and their eventual motor performance. The MOS of infants with Down syndrome was higher than in infants who were later diagnosed with cerebral palsy but lower than in infants with normal neurological outcome [15]. However, until now, there have been no studies on the application of GMs in the early assessment and prediction of motor developmental outcomes for infants with PWS. In addition, sufficient inter- and intra-observer reliability is required in order for different testers to correctly use an instrument for scientific and clinical purposes.
The "Assessment of Motor Repertoire - 3 to 5 Months" has not yet been tested for inter- and intra-observer reliability in infants with PWS. Therefore, the purpose of this study was to determine the inter- and intra-observer reliability of the "Assessment of Motor Repertoire - 3 to 5 Months" based on video recordings of infants with PWS.

Methods
The procedures and the reporting of this study were carried out in accordance with the Guidelines for Reporting Reliability and Agreement Studies (GRRAS), published in 2011 [20]. When reporting reliability and agreement, the participant and observer populations, the assessment process, the data analysis, and the reliability and agreement results should be described in detail [9].

Study design
This was a cross-sectional study to assess the degree of inter- and intra-observer reliability and agreement. Three raters with advanced training (observers A, B, and C) evaluated the same 15 GMs video recordings of infants with PWS by applying the "Assessment of Motor Repertoire - 3 to 5 Months" and following the standardized assessment procedures [18]. Observers A and B were blinded to the medical history of the infants.

Participants
Fifteen infants with PWS were selected between November 2014 and December 2017 from the rehabilitation department of Children's Hospital of Fudan University. The study was given ethical approval by the Research Ethical Board of Children's Hospital of Fudan University (certificate number 2019(NO.025)). Participants meeting the following inclusion criteria were included: (1) a definite PWS genetic diagnosis report was available; (2) at least one GMs video recording had been made in the "Fidgety Movements" period; (3) written consent had been given by the legal guardian of the participant, allowing the video recordings to be used for research purposes. Participants who met either of the following exclusion criteria were excluded: (1) the brain MRI report showed severe brain injury lesions; (2) the infant suffered from any other genetic metabolic disorder.
In all, there were 15 participants in this study, 12 of whom were male and 3 of whom were female. Birth weight ranged from 1802 to 3250 g, and the average birth weight was 2622.13 g. Gestational age ranged from 33 to 38 weeks in 3 infants and from 38 to 40 weeks in 12 infants, and the average gestational age was 38 weeks. The PWS genetic diagnosis was confirmed between 3 and 65 weeks after birth, and the average age at genetic diagnosis was 18.33 weeks. All 15 infants with PWS had received early rehabilitation intervention during the first four weeks of life. Further details on the characteristics of the infants with PWS are outlined in Table 1.

Observers
All three observers had completed the GMs training courses and obtained the assessment qualification certificate offered by the GMs Trust before starting this study [21]. They complied with the guidelines and grading criteria of the assessment of general movements and the "Assessment of Motor Repertoire - 3 to 5 Months" scales [18][22][23]. The three observers were labelled A to C; observer A and observer B were early intervention therapists for infants, and observer C was a rehabilitation doctor. All of them had rich experience in GMs assessment in clinical practice. Observers A and B had obtained advanced training course qualification certificates four years earlier. During the study period, observers A and B did not know the specific medical history and risk factors of the participants and were only provided with the age of the infants at the time of the video recordings. Observer C had been granted the advanced training course qualification certificate two years earlier. As the organizer of the present study, observer C knew the basic medical history of the participants but had no information on the previous GMs assessment results.

Video recordings
We searched for GMs videos by the ID numbers of infants meeting the inclusion and exclusion criteria in the GMs application of the diagnosis and treatment system databases of Children's Hospital of Fudan University. GMs videos were taken according to the guidelines outlined by C. Einspieler [24]. (1) The infants were lying in the supine position with minimal clothing, no dummy (pacifier) or toys, and in an adequate behavioural state. Sequences that included crying and fussing were discarded. (2) Each GMs video recording lasted 3-5 minutes.
Accordingly, a total of 21 infant video recordings were retrieved. If a participant had more than one GMs video recording, one of them was randomly selected for the study. If there was only one GMs video recording for a participant, it was used. Finally, 15 representative sequences of GMs video recordings were included in the reliability and agreement study. The GMs video recordings lasted between 3.1 and 6.9 minutes, with an average duration of 4.3 minutes.

The assessment tool
"Assessment of Motor Repertoire - 3 to 5 Months" [16,22] is an observational instrument designed to assess the GMs video recordings of infants. It is divided into three main domains of observation, namely "Movement Patterns" (23 items), "Postural Patterns" (13 items), and "Movement Character" (8 items). The overall result (44 items) is taken as a basis for the "Motor Optimality List", based on the scoring of five subcategories, the first of which rates "Fidgety Movements" as normal (12 points

Assessment procedure
Observers assessed the 15 GMs video recordings on computers in separate locations. Observers were allowed to watch the videos repeatedly, but not to communicate with each other. The score sheets were numbered consecutively from 1 to 15 according to the birth dates of the infants, and the video recordings were assessed in a pre-arranged order [24][25].
For the inter-observer reliability part of the study, each observer completed the assessment of all GMs video recordings within the same day. For the intra-observer (re-test) reliability part of the study, the interval between the two assessments of the same GMs video recording by the same observer was required to be more than three months.

Statistics
All data were analyzed by statisticians using SPSS (Statistical Product and Service Solutions; version 17.0, SPSS Inc., Chicago, Illinois, United States). Agreement in the respective subcategories of the "Assessment of Motor Repertoire - 3 to 5 Months" was assessed by means of kappa statistics, or expressed in terms of percent agreement if the kappa value could not be determined. Cohen's kappa was used to determine intra- and inter-observer agreement while accounting for agreement by chance [19]. The interpretation of results follows the guidelines by Landis and Koch [26], who classify a κ value of <0.20 as poor agreement, 0.21-0.40 as fair, 0.41-0.60 as moderate, 0.61-0.80 as good, and 0.81-1.00 as very good agreement. Intraclass correlation coefficient (ICC) statistics were applied to examine agreement of sum scores among the observers in the "Assessment of Motor Repertoire - 3 to 5 Months". ICCs are correlation coefficients that allow comparison of two or more repeated measurements [26]. For the "Motor Optimality Score", ICC statistics were applied to examine pairwise intra- and inter-observer agreement (A-B, A-C, B-C) and agreement among all three observers (A-B-C). The measurement error, termed "Sw", was calculated as the square root of the mean within-subject variance. The 95% confidence intervals (95% CI) for the correlation coefficients were also calculated [27].
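As an illustration of the kappa analysis described above, the following is a minimal sketch of Cohen's kappa for two observers, together with the interpretation bands quoted in the text. It is not the SPSS procedure used in the study; the function names and the example ratings are hypothetical, and treating a value of exactly 0.20 as "poor" is an assumption, since the quoted bands leave that boundary open.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who scored the same items.

    Returns None when chance agreement equals 1 (both raters used a
    single identical category), because kappa is then undefined.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of items on which the raters agree.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's category frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        return None
    return (p_o - p_e) / (1 - p_e)


def interpret_kappa(kappa):
    """Map kappa to the agreement bands quoted in the Statistics section."""
    if kappa <= 0.20:  # assumption: 0.20 itself is counted as "poor"
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Hypothetical subcategory scores from two observers for ten videos.
observer_a = [1, 1, 2, 2, 1, 1, 2, 1, 1, 2]
observer_b = [1, 1, 2, 1, 1, 1, 2, 1, 1, 2]
kappa = cohen_kappa(observer_a, observer_b)
print(round(kappa, 2), interpret_kappa(kappa))
```

Returning `None` rather than raising mirrors the fallback described in the text, where percent agreement is reported when kappa cannot be determined.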

Results
Intra- and inter-observer agreement of sum scores among the observers in the MOS
As expected, the infants that participated in the study received scores in the lower part of the 5- to 28-point range of the total MOS. Intra- and inter-observer agreement for the total MOS is reported as ICC values in Table 2 (inter-observer agreement) and Table 3 (intra-observer agreement). ICC values ranged between 0.86 and 0.95 for pairwise inter-observer agreement.
Overall inter-observer agreement was 0.93. For intra-observer agreement, ICC values ranged between 0.95 and 0.98.
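The text does not state which ICC model was selected in SPSS. Purely as a sketch, and assuming a two-way random-effects, absolute-agreement, single-rater ICC (often written ICC(2,1)), the coefficient can be computed from the mean squares of a subjects-by-raters table as follows. The function name and the example scores are hypothetical and are not data from the study.

```python
def icc_2_1(scores):
    """ICC(2,1) from a table with one row per subject and one column per rater.

    Assumed model: two-way random effects, absolute agreement, single rater.
    """
    n = len(scores)      # number of subjects (videos)
    k = len(scores[0])   # number of raters (observers)
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects, raters, and the residual.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical total MOS values for five videos scored by observers A, B, C.
mos_scores = [
    [6, 6, 7],
    [8, 9, 8],
    [5, 5, 5],
    [7, 8, 7],
    [9, 9, 10],
]
print(round(icc_2_1(mos_scores), 2))
```

A pairwise value (for example A-B) would be obtained by passing only the two relevant columns. Other ICC forms, such as consistency or average-measures variants, use different denominators and would give different values.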
The measurement error (Sw) ranged from 2.54 to 2.66 between the various pairs of observers in the assessment of the MOS. The overall Sw among the observers was 3.82. For the same observers between their first and second assessments of the MOS, the Sw ranged from 2.58 to 2.90.
Table 2 Inter-tester reliability of the total "Motor Optimality Score", pairwise and among the three observers (A-C): intraclass correlation coefficient (ICC), 95% confidence intervals (95% CI), and within-subject standard deviation (Sw).
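The Statistics section defines Sw as the square root of the mean within-subject variance. A minimal sketch of that calculation is given below, assuming the sample variance (n - 1 denominator) is taken across the repeated scores of each infant; the function name and example values are hypothetical.

```python
import math
from statistics import variance


def within_subject_sd(scores):
    """Sw: square root of the mean within-subject variance.

    `scores` holds one row per subject; each row contains the repeated
    scores for that subject (e.g. one per observer or per session).
    """
    # Sample variance of the repeated scores within each subject.
    variances = [variance(row) for row in scores]
    return math.sqrt(sum(variances) / len(variances))


# Hypothetical MOS values: three videos, each scored by two observers.
paired_scores = [[5, 7], [6, 6], [8, 10]]
print(round(within_subject_sd(paired_scores), 2))
```

Sw is expressed in the same units as the score itself, so it can be read directly as a typical disagreement in MOS points.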

Intra- and inter-observer agreement of scores among the observers in the respective subcategories
Complete agreement among the three observers (A-C) was achieved in the subcategories "Fidgety Movements" and "Movement Character": all infants were scored as having absent "Fidgety Movements" (1 point) and a "Movement Character" that was abnormal, but not cramped-synchronised (2 points). Consequently, no kappa value for fidgety movements or movement character could be calculated among the three observers (A-C). Therefore, agreement between observers A-B, A-C and B-C regarding the subcategories "Fidgety Movements" and "Movement Character" was expressed in terms of percent agreement, which was 100% (Table 4).
In the other subcategories, data from all 15 infants were included in the analysis. Moderate to high inter-observer reliability was achieved in the assessment of "Repertoire of Co-Existent Other Movements" and "Quality of Other Movements", with kappa values ranging between 0.63 and 1.00; a single value reached perfect agreement, in "Repertoire of Co-Existent Other Movements" (B-C). The assessment of "Posture" resulted in very good pairwise inter-observer agreement (A-B, A-C, B-C), with all kappa values equal to 1.00 (Table 4).
For intra-observer agreement (A, B, C) in the respective subcategories, data from all 15 infants were included in the analysis. Agreement in the subcategories "Fidgety Movements" and "Movement Character" (A, B, C) and "Posture" (B) was expressed in terms of percent agreement, which was 100%. In the other subcategories, moderate to high intra-observer reliability was achieved in the assessment of "Repertoire of Co-Existent Other Movements" (A, B, C), "Quality of Other Movements" (A, B, C), and "Posture" (A, C), with kappa values ranging between 0.63 and 1.00 (Table 5).
Table 4 Inter-tester reliability of "Assessment of Motor Repertoire - 3 to 5 Months" subcategories.
Table 5 Intra-tester reliability of "Assessment of Motor Repertoire - 3 to 5 Months" subcategories.
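The reason kappa cannot be calculated in this situation follows from its standard definition. In the sketch below, p_o (observed agreement), p_e (agreement expected by chance) and p_{A,c}, p_{B,c} (the proportions of ratings that observers A and B assign to category c) are conventional notation, not symbols used in the original text.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
\qquad
p_e = \sum_{c} p_{A,c}\, p_{B,c}

% If both observers place every infant in the same single category c^*:
p_{A,c^*} = p_{B,c^*} = 1
\;\Rightarrow\;
p_o = 1,\quad p_e = 1
\;\Rightarrow\;
\kappa = \frac{1 - 1}{1 - 1} = \frac{0}{0}\ \text{(undefined)}
```

Because the denominator vanishes whenever all ratings fall into one category, percent agreement is the only agreement measure that remains available in that case.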

Discussion
The clinical phenotypes of PWS vary greatly with age. During the early period after birth, motor performance is particularly affected in infants with PWS. Most cases present as severely hypotonic, inactive and sometimes almost motionless in the neonatal period. Although they persistently suffer from hypotonia, muscle weakness, and severe motor development delays, young infants with PWS gradually become more responsive and present more spontaneous movements weeks or months later [6, 28]. Motor problems occur after birth and continue into adulthood. PWS patients score well below the normal range on standardized motor performance tests [29]. In 1993, Holm VA was the first to propose a clinical scoring system for the diagnosis of PWS based on the characteristics of the clinical symptoms. This system subdivided and scored each clinical manifestation according to age, enabling a standardized clinical diagnosis of PWS. For newborn infants with obvious hypotonia, feeding difficulty and gonadal dysplasia, the possibility of PWS should be suspected [30]. However, there is still a lack of research on how to carry out a standardized quantitative assessment of motor performance, particularly in individuals with suspected PWS or infants diagnosed with PWS at a young age (less than 6 months old).
It is extremely difficult to carry out prospective behavioural analyses of rare disorders [11]. However, GMs assessment (video recording assessment) has been applied to early screening and assessment in genetic metabolic diseases [31]. Christa Einspieler and her colleagues conducted a longitudinal study on the value of movements and postures in infants with Smith-Magenis syndrome for predicting the development of neurological deficits later in life. The findings, which were a significantly reduced motor repertoire, absent fidgety general movements, abnormal posture, and jerky and monotonous overall movements, indicate a severe motor impairment by no more than 4 months of age [11]. Similar results have been shown in other genetic metabolic disorders, where the infants with the most severe phenotypes show no fidgety movements when aged 3-5 months, while their neurodevelopmental status was found to be more or less normal otherwise [12,[32][33]. Until now, there has been little systematic data on the early neurodevelopmental functioning of infants with PWS, and this study serves as preliminary research to document a behavioural manifestation of the syndrome as early as [...] Months" was divided into three subtypes: normal fidgety movements (12 points), abnormal fidgety movements (4 points) and absent fidgety movements (1 point). The assessment results of all observers (A-C) were completely consistent, namely absent fidgety movements (1 point), in line with the severely hypotonic, inactive and sometimes almost motionless presentation of infants with PWS in the neonatal period seen in this study. Because kappa requires ratings that vary across categories, it cannot be calculated when every rating falls into a single category, as was the case for absent fidgety movements in this study. Therefore, percent agreement was used to represent the degree of reliability and agreement in the subcategory "Fidgety Movements".
Complete agreement (100%) was achieved in both the inter-observer reliability (A-B, A-C, B-C) and the re-test reliability of the observers (A-C) in this study. Inter- and intra-observer reliability in the subcategory "Movement Character" of the "Assessment of Motor Repertoire - 3 to 5 Months" was exactly the same as that of "Fidgety Movements". (3) The points achieved in the subcategories "Quality of Other Movements" and "Posture" were calculated from the sum of the numbers of normal and abnormal items in their respective category entries. Accordingly, the result did not depend simply on the inter- and intra-observer agreement for each item: even low agreement on individual items within these subcategories would not necessarily affect the MOS or, ultimately, the ICC values. Moderate to high inter-observer reliability was achieved in the assessment of "Quality of Other Movements" among [...] Compared with previous studies in genetic metabolic disorders, the reliability study of detailed GMs assessment for Smith-Magenis syndrome reported inter-observer kappa values between 0.82 (assessment of posture) and 1.00 (assessment of fidgety movements), which is similar to the results for these two subcategories in this study [31]. In the reliability study of high-risk infants, inter-observer reliability in the assessment of the total MOS was high, with an ICC of 0.87 and ICCs for the pairwise analyses ranging between 0.80 and 0.94, which is consistent with the results of this study [19]. However, agreement in the subcategories "Posture" and "Movement Character" in this study was better than that reported for infants with a high risk of brain injury, in which the "Movement Character" kappa was 0.54-0.84 and the "Posture" kappa was 0.39-0.56.
In this study, the "Assessment of Motor Repertoire - 3 to 5 Months" showed high inter- and intra-observer reliability in infants with PWS. This is related not only to the characteristics of early motor performance in infants with PWS but also to the following factors: (1) Video acquisition followed strict regulations to ensure high-quality video recordings, which play an important role in the accuracy of GMs assessment results based on Gestalt perception.
(2) The three observers had completed the advanced GMs training courses and had rich experience in GMs assessment in clinical practice. In addition, we hold a GMs assessment quality control meeting once a week to discuss difficult GMs cases and the standardized assessment process. All of these clinical practices help to greatly improve the accuracy and consistency of GMs assessment results among observers.
There are also some limitations to this study. First, the sample is not fully representative of the PWS population: more than 90% of the infants with PWS had a "Total Optimality Score" under 9 points, and the sample size was very limited. Second, the assessment results were completely consistent, namely absent fidgety movements (1 point), for all infants with PWS in this study. Thus, the assessment of "Fidgety Movements", which itself showed good inter-observer agreement, had a significant effect on the ICCs for the total MOS. Therefore, the sample size and number of study units should be increased in later studies to reduce research bias.

Conclusion
To our knowledge, this is the first report of a clinical cross-sectional study of GMs assessment in patients with PWS. This study focused on the inter- and intra-observer reliability of the detailed GMs assessment, the "Assessment of Motor Repertoire - 3 to 5 Months", in infants with PWS.
In particular, the "Motor Optimality Score" and the subcategories "Fidgety Movements", "Movement Character" and "Posture" showed high to perfect agreement in inter-observer and re-test reliability.
This study provides a basis for further analysis of the characteristics of early motor performance in infants with PWS, and of how they compare with those of infants with cerebral palsy and normal infants, based on the "Assessment of Motor Repertoire - 3 to 5 Months".