Validation of a Human Pose Tracking Algorithm for Measuring Upper Limb Joints: Comparison With Photography-based Goniometry

Qingtang Zhu (zhuqingt@mail.sysu.edu.cn), First Affiliated Hospital of Sun Yat-sen University
Jingyuan Fan, First Affiliated Hospital of Sun Yat-sen University
Fanbin Gu, First Affiliated Hospital of Sun Yat-sen University
Lulu Lv, First Affiliated Hospital of Sun Yat-sen University
Zhejin Zhang, Guangdong AICH Technology Co., Ltd
Changbing Zhu, Guangdong AICH Technology Co., Ltd
Jian Qi, First Affiliated Hospital of Sun Yat-sen University
Honggang Wang, First Affiliated Hospital of Sun Yat-sen University
Xiaolin Liu, First Affiliated Hospital of Sun Yat-sen University
Jiantao Yang, First Affiliated Hospital of Sun Yat-sen University


extension: 0.65; wrist extension: 0.78) except wrist flexion. All the intra-class correlation coefficients were larger than 0.60. The Pearson coefficients also showed high correlations between the two measurements (p<0.001).
Conclusions: Our results indicate that pose estimation is a reliable method for measuring shoulder and elbow angles, supporting the use of RGB images for measuring joint ROM. They also demonstrate the possibility that patients can assess their ROM from photos taken with a digital camera.

Background
Joint range of motion (ROM) is a measure of interest in clinical practice, as it is significant for the diagnosis, functional assessment and treatment evaluation of the upper extremity. Measurement of ROM is reportedly required in more than 80% of the commonly used functional assessment scales for the shoulder and elbow (1).
Conventionally, ROM has been measured by manual goniometry (2). The goniometer is low-cost and portable, but its reliability depends heavily on the rater's experience (3). Moreover, because the procedure is demanding and time-consuming, it is almost impossible to perform frequently in daily clinical settings.
In addition, with the rapid development of telemedicine, how to determine joint movement at a distance has piqued the interest of many researchers (4)(5)(6)(7)(8)(9). A typical motion capture system can provide accurate kinematic measurements (10, 11) but requires a large space for data collection, which makes it costly, not portable, and thus impractical for home use. Advances in smartphone technology, specifically built-in sensors and high-resolution cameras, provide a potential platform for joint measurement. The number of mobile applications used for clinical assessment has increased considerably in recent years (12). These applications fall into two main groups: those using the embedded sensors and those using images taken by the phone camera. Compared with the sensor-based method, the photography-based method provides recordable, documented information, an easier procedure to follow and a contactless measuring process, so it is well accepted by patients (13). However, the existing applications still need raters to mark the photos (14), which means they reduce neither the workload of the therapist nor the subjectivity of the results.
Therefore, an objective, accurate and automatic method is desired. Recent advances in human tracking algorithms offer a new option for this task. These algorithms can detect joint points and calculate joint angles from RGB images, providing an attractive alternative for measuring at a distance (11,15). In this study, we employed OpenPose, one of the most widely used methods, proposed by Cao et al. (16), to estimate joint angles from RGB images. Previous articles have evaluated the reliability of OpenPose-based systems in gait analysis (17) and Parkinson rating (18, 19). Ota et al. also compared OpenPose and VICON (a 3D motion capture system) in measuring lower limb joint angles and found significant associations between the two methods (11). However, the utility of OpenPose in the assessment of upper limb angles remains unclear. Herein we constructed a measuring setup based on this algorithm, using RGB images to measure upper limb movements. This study evaluates the accuracy and reliability of the proposed method by comparing its results with photography-based goniometry.

Participants
Thirty healthy young adults (20 males, 10 females, 22-35 years old) with no reported medical history or impairment of the upper limbs participated in this study. This study was approved by the institutional review board of our institution (2021-387). The sample size was estimated with PASS software (version 15.0) using an equivalence test for the difference between two means. With a type I error (α) of 0.05, a power (1-β) of 0.95, an equivalence limit of 10 degrees, and a standard deviation of 10 degrees, a minimum of 27 samples was required.
All subjects were given full explanations about the motion tasks. After that, written consent was obtained for the use of their images for research purposes.
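As a rough cross-check of the sample-size figure reported above, the following minimal sketch uses the textbook normal approximation for an equivalence (two one-sided tests) comparison of two means. It is not the PASS procedure: the source does not report the exact PASS settings, so the two-independent-groups design, the true difference of zero, and the function name `equivalence_n_per_group` are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def equivalence_n_per_group(alpha, power, margin, sd):
    """Normal-approximation sample size per group for an equivalence test.

    Assumes two independent groups, equal SDs and a true difference of zero
    (assumptions for this sketch, not settings taken from the paper).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    # With a true difference of zero, the type II error is split over
    # the two one-sided tests.
    z_beta = NormalDist().inv_cdf(1 - (1 - power) / 2)
    n = 2 * ((z_alpha + z_beta) * sd / margin) ** 2
    return math.ceil(n)


# Values reported in the Participants section.
n_required = equivalence_n_per_group(alpha=0.05, power=0.95, margin=10, sd=10)
print(n_required)
```

Under these assumptions the approximation gives 26 per group, close to the reported minimum of 27. Exact calculations based on the t distribution, as used by dedicated software, typically give slightly larger values than the normal approximation.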

Measurement setup
The measurement environment is shown in Supplementary Figure 1A. Three commercial digital cameras (Hikvision DS-2CD3T56FWDV2-I3) were positioned around the field (one in front and two at the sides). The cameras were mounted at a height of 1.5 m, and the distance between each camera and the subject was 3 m. To ensure consistent participant placement, feet markers were placed 3 m away from the cameras.

Motion tasks and parameters extracting
We designed a 6-task procedure including shoulder abduction, shoulder elevation, elbow flexion, elbow extension, wrist flexion and wrist extension (shown in Supplementary Figure 1B). To control the impact introduced by rotation, all angles of interest in our design were fully presented in either the sagittal or the coronal view. Participants were asked to stand in the field and perform the motion tasks one after another. To ensure that they performed the tasks as instructed, we set a screen in front of the participants showing written and video instructions. Their own motion videos were also displayed on that screen in real time. All photographs were taken from the anterior side, except for elbow flexion, which was taken from the lateral side (one photograph for each side).

Automatic measurement
The landmarks of each joint were estimated by the OpenPose Human Pose Estimation library (version 1.5.0) (16). The coordinates of the joint landmarks were then extracted, and skeleton models were rebuilt accordingly.
Then, each joint angle was calculated from the corresponding coordinates.
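The angle calculation described above can be sketched as follows. The source does not give the exact formula or which landmarks define each angle, so this is only one common approach: the angle at a middle landmark is taken between the two vectors pointing to its neighbouring landmarks. The function name `joint_angle` and the pixel coordinates are hypothetical.

```python
import math


def joint_angle(proximal, joint, distal):
    """Angle in degrees at `joint` between the segments joint->proximal
    and joint->distal, using 2D image coordinates (x, y)."""
    v1 = (proximal[0] - joint[0], proximal[1] - joint[1])
    v2 = (distal[0] - joint[0], distal[1] - joint[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0 or n2 == 0:
        raise ValueError("landmarks must not coincide")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Hypothetical pixel coordinates for shoulder, elbow and wrist landmarks.
shoulder, elbow, wrist = (100.0, 100.0), (100.0, 200.0), (200.0, 200.0)
elbow_angle = joint_angle(shoulder, elbow, wrist)
print(round(elbow_angle, 1))
```

With these coordinates the upper arm is vertical and the forearm horizontal, so the included angle at the elbow is 90 degrees. Converting an included angle into a clinical flexion or extension value depends on the chosen zero position, which is not specified here.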

Digital photography-based measurement
After the automatic measurement, the photography-based measurements were conducted on the same images. The joint angles were measured independently by two hand surgeons, who applied screen goniometer software to the images displayed on a computer screen. Screen goniometry was chosen mainly to ensure that the posture presented to the measurement system and to the human researchers was identical; the validity of this method has been previously confirmed (20,21). To minimize the uncertainty of manual assessment, the images were reassessed by the same researchers after an interval of one week. The landmarks included the centers of the shoulder, elbow and wrist, the axes along the center of the upper arm and forearm, and the central axis along the metacarpals. During the measurement, observers were free to locate the landmarks after reading the instructions.
During this procedure, observers were not allowed to see the results of the automatic measurement or the other observer's report.

Data processing and statistical analysis
The mean values of the four manual measurements (2 researchers × 2 rounds) were taken as the standard results for comparison. All measurements are presented as mean ± standard deviation (mean ± sd). The deviation between the automatic assessment and the standard results, together with its 95% confidence interval (CI), was calculated to assess accuracy. The intra-class correlation coefficient (ICC) between the standard and the proposed measurement was also calculated to assess agreement. Next, the results were analyzed using Bland and Altman analysis (22). The upper limits of agreement (LOA) were used as reference values to judge whether the proposed measurement could be a reliable method for upper limb ROM. Since it has been confirmed that OpenPose gives identical results when it analyzes the same image twice, the repeatability of the automatic method was not assessed. In contrast, the repeatability of manual measurement was evaluated by comparing the test-retest results. In addition, to confirm validity, linear regression analyses were conducted to compare the manual and system measurement data. R-square was calculated to evaluate the correlation between the different methods.
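The reference standard and the Bland and Altman analysis described above can be sketched as follows. This is a minimal illustration under the usual definition of the limits of agreement (bias ± 1.96 × sd of the differences). The function names are hypothetical and the angles are made-up values, not study data.

```python
import statistics


def reference_standard(readings):
    """Mean of the manual readings for one image (2 observers x 2 rounds)."""
    return statistics.mean(readings)


def bland_altman(test_values, reference_values):
    """Return (bias, lower LOA, upper LOA) for paired measurements."""
    diffs = [t - r for t, r in zip(test_values, reference_values)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up angles in degrees, for illustration only (not study data).
manual = [[90, 91, 89, 90], [88, 90, 89, 89], [94, 92, 93, 93], [91, 91, 90, 92]]
system = [92.0, 88.0, 95.0, 90.0]

standard = [reference_standard(r) for r in manual]
bias, loa_lower, loa_upper = bland_altman(system, standard)
print(bias, loa_lower, loa_upper)
```

A bias far from zero points to a systematic over- or under-estimation by one method, while wide limits of agreement indicate large disagreement for individual images even when the mean difference is small.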

The shoulder, elbow and wrist measurements obtained by the two observers and the human tracking algorithm are summarized in Table 1 and Figure 1. An example of the automatic measurement results is shown in Supplementary Figure 2.

Pose estimation
The poses of the participants were successfully estimated in all but two images. Both failures were caused by person detection errors: these pictures included more than one person, and the angle calculation was performed on the wrong target. The success rate was 99.44% (358/360).

Difference between observers
The results of the inter- and intra-observer comparisons are presented in Table 2 and Table 3. There was excellent agreement between observers, with mean differences ranging from 0.08 to 4.33 degrees and ICC values ranging from 0.897 to 0.951. The intra-observer comparison also indicated good consistency: the mean differences between test and re-test measurements were less than 5 degrees.

Difference between observer and machine
As shown in Table 2, the observer-system differences were comparable to the inter- and intra-observer differences.
The largest difference was found in wrist flexion (8.96±12.71 degrees; 95% CI: -12.24 to -5.68). In the other 5 motions, the 95% confidence intervals of the mean differences between manual and automatic assessment were within 5 degrees. Similarly, the Bland-Altman plots indicate acceptable agreement for the shoulder and elbow motions. In comparison, the agreement for wrist motions is relatively poor (Figure 2), as the limits of agreement exceeded 10 degrees. The consistency was then further evaluated by ICC values. The results suggested good to excellent agreement (ICC>0.60) in all motions (Table 3). The lowest consistency was found in shoulder elevation and wrist extension (ICC=0.620), while the highest was found in elbow extension (ICC=0.831).
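For readers who wish to reproduce an agreement statistic of this kind, the sketch below computes one common form of the intra-class correlation coefficient from a subjects-by-raters table. The source does not state which ICC model was used, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single measurement) is an assumption, as are the function name and the example ratings.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per rater."""
    n = len(ratings)      # number of subjects (images)
    k = len(ratings[0])   # number of raters (e.g. system and observer mean)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_rows - ms_error) / denom


# Made-up angles in degrees: column 0 = system, column 1 = observer mean.
example = [[92.0, 90.0], [88.0, 89.0], [95.0, 93.0], [90.0, 91.0], [101.0, 99.0]]
icc_value = icc_2_1(example)
print(round(icc_value, 3))
```

Because ICC(2,1) measures absolute agreement, a constant offset between the system and the observers lowers the coefficient even when the two sets of values are perfectly correlated, which is why it complements the correlation analysis.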
Additionally, linear correlations between system and observer measurements were also demonstrated (R ranges from 0.45 to 0.71, p<0.001; Figure 3).

Sys: The automatic measuring system; Doc1_1: The first measurement of the first doctor, the rest are named in the same manner; Doc_mean: The average measuring value of doctors; sd: standard deviation; CI: confidence interval
Table 3 Summary of the intra-class correlation coefficients

Discussion
The range of motion (ROM) of the upper limb is an important clinical parameter for various functional evaluations before and after treatment. Conventionally, ROM was assessed manually using a standard goniometer. This procedure is time-consuming and requires expertise. Moreover, financial or geographic constraints or a busy schedule can prevent patients from visiting the clinic (23). Therefore, telemedicine has become popular as a method of patient evaluation. Photographs are easily obtained and shared in daily life, and deriving movement parameters from remote photographs has the potential to decrease the cost of physical evaluation. Human pose tracking algorithms can automatically calculate joint angles from RGB images and provide a new option for remote evaluation. However, the reliability of this method must be established before it is used in clinical settings.
This study sought to evaluate the reliability of an automatic goniometry method. In our analysis, we found that the algorithm-based method has acceptable reliability compared with human observers. The results indicate that the differences between the proposed method and the average value of the observers are less than 5 degrees in shoulder and elbow motions, comparable to the inter- and intra-observer differences. Compared with previous studies, these differences are notably smaller than those of visual estimation (24,25) and are comparable to those of inertial sensors (26) and depth cameras (27). Therefore, the proposed method may offer high accuracy and reliability in measuring the ROM of the shoulder and elbow.
In this study, the greatest observer-machine difference was found in wrist flexion, with a mean value of 8.96 degrees. This reliability is still competitive compared with other image-based applications (8, 28). Nevertheless, as seen in the Bland-Altman plots, the angle was over-estimated by the system in most cases. Thus, we speculate that this might be a systematic error that could be corrected when a larger sample size is available.
It is difficult for participants to keep their posture still during measurement, as previous studies have indicated (29,30). According to the literature, several methods have been employed to minimize this problem. Cook et al. used a wooden triangle with fixed internal angles to support the joints of interest during assessment (31). Chang et al. adopted a glass plate as a hand support to reduce movement during the 3D scanning process (32). More commonly, studies use a 3D motion capture system to collect data simultaneously (33)(34)(35) and thus minimize the differences caused by involuntary posture changes. Our study compared the results of the automatic system and the human observers by having each measure the same image independently. In this way, we could determine the actual differences between the two methods without the influence of inconsistent motions.
The concept of obtaining joint ROM from photographs is not new. Previous studies have indicated that it is accurate and reliable compared with conventional clinical goniometry (20,21). Additionally, the results of image-based goniometry can be more consistent than those of the conventional approach in some cases (21). The present study also supports the value of screen goniometry as a reliable alternative for measurement, with small inter- and intra-observer differences.
Our study has some limitations. Firstly, the participants were limited to young, healthy persons and did not include elderly people or patients, which makes the results statistically less robust and limits the generalizability of the proposed method. Secondly, motions involving rotation were not assessed because it is hard to estimate 3D motions from 2D images. Although this may be an unavoidable technical limitation, it will be addressed in our future studies. In addition, a measured joint angle may include the movements of several joints (for example, the angle of the shoulder joint includes the movement of the scapula, thorax, and thoracic spine), which leads to measurement inaccuracy, but we believe the accuracy is still sufficient for a telemedicine system. Another drawback is that the accuracy of our method depends to some extent on the compliance and cooperation of participants. If a subject does not properly understand the instructions, the results can deviate.

Conclusions
This study demonstrates a reliable and valid method to measure joint ROM of the upper limb using RGB photographs. It provides an exciting alternative for remote evaluation, adding objective and accurate information on upper limb mobility. Moreover, the proposed method is fully automatic, so users can obtain reliable kinematic parameters by themselves without travelling to clinical centers. Future work should include a larger sample of patients or elderly people with movement disorders and examine more motions.

Abbreviations
ROM: range of motion; LOA: limits of agreement; ICC: intra-class correlation coefficient; CI: confidence interval

Declarations
Ethics approval and consent to participate

Figure 1 Comparison of the measurement results of the 6 motions. Sys: The automatic measuring system; Doc1_1: The first measurement of the first doctor, the rest are named in the same manner.
Figure 2 Bland-Altman plots for inter-rater agreement. Sys: The automatic measuring system; Doc: Doctor; Doc_mean: The average measuring value of doctors. This plot compares the individual measurement results with the average value of the doctors. The x-axis represents the mean value; the y-axis represents the inter-rater difference. The dotted lines represent the limits of the differences.
Figure 3 The linear correlation between raters. Sys: The automatic measuring system; Doc_mean: The average measuring value of doctors.

Supplementary Files
This is a list of supplementary files associated with this preprint.