We present an open-source software library for the real-time inverse kinematical analysis of IMU data with user-defined musculoskeletal models using OpenSim 4.1. Full-body IK can be calculated for a single sample in less than 100 ms. On a desktop computer, the software library can solve RTIK at more than 1100 samples per second while tracking the pelvis and the lower extremities and more than 900 samples per second while tracking the full-body kinematics. On a laptop computer, the corresponding throughputs were 700 and 600 samples per second, respectively. Using 12 IMUs to track walking and visualizing the results on a full-body running model, RTIK was solved at 45 samples per second. The drop from the IMU output sampling rate of 60 Hz resulted in a minimal difference in calculated joint ROMs (< 0.3 degrees). The software library allows the use of RTIK virtually without limitations due to location or environment. This opens possibilities for a variety of applications including rehabilitation, ergonomics and human-machine interfaces for controlling collaborative robots. Moreover, it has been shown that a laboratory setting may affect how a person moves [20] and thus it is beneficial that the movement of interest can be observed in the real environment of that movement or behavior.
Execution times and throughputs
We investigated the execution times and throughputs of the IMU-based IK to determine whether the output can be considered real-time. Pizzolato et al. [13] used an execution time of 75 ms as the threshold for a real-time system. This threshold was based on a study by Kannape and Blanke [21] in which subjects identified the displayed motion as self-generated in real-time in over 80% of the cases when the delay in motion display was less than 75 ms. Even with a delay of 210 ms, subjects identified the visualized motion as self-generated in real-time in 50% of the cases. Borbély and Szolgay [10] noted that the IK algorithm of OpenSim 3.3 had an execution time of about 145 ms, thus calculating IK at about 7 Hz and “falling behind the generally accepted practice in human movement recording of at least 50 Hz”. Therefore, a real-time application should achieve an IK throughput of at least 50 operations per second with an execution time below 75 ms for any single operation. With our software library, we aimed to achieve this target by using multithreading and the IK algorithm of OpenSim 4.1.
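The two criteria can be written as a simple check. The following is a minimal sketch and not part of the software library: the function names and the way the inputs are passed are hypothetical, and only the 75 ms and 50 Hz thresholds and the 145 ms figure are taken from the text.

```python
def meets_realtime_criteria(execution_times_ms, throughput_hz,
                            max_delay_ms=75.0, min_throughput_hz=50.0):
    """Return True if every single operation stays below the delay
    threshold and the throughput reaches the minimum rate."""
    worst_case_ms = max(execution_times_ms)
    return worst_case_ms < max_delay_ms and throughput_hz >= min_throughput_hz


def single_thread_rate_hz(execution_time_ms):
    """Rate achievable when operations run strictly one after another."""
    return 1000.0 / execution_time_ms


# An execution time of 145 ms (OpenSim 3.3, as reported in [10])
# corresponds to roughly 7 operations per second.
opensim33_rate = single_thread_rate_hz(145.0)
print(round(opensim33_rate, 1))                          # 6.9
print(meets_realtime_criteria([145.0], opensim33_rate))  # False
```

Note that the two criteria are independent once several operations run concurrently: multithreading can raise the throughput above 50 operations per second without shortening the execution time of any single operation.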
Another interesting finding by Kannape and Blanke [21] was that subjects modulated their stride based on the delay between the motion and its visualization. Therefore, it is important to minimize the delay when preparing a real-time measurement setup to prevent subjects from altering their gait characteristics based on delayed visual feedback.
Live visualization is unnecessary in applications where IK is an intermediate output that is used to estimate contact forces, instruct a robot arm in rehabilitation applications or calculate gait parameters, to name a few examples. Thus, the performance tests were designed so that they evaluate only the performance of IK, which is the core feature of the software library.
Real motion, such as walking, contains a combination of different orientations, most of which are within a typical model’s joint angle boundaries. The constant identity quaternions used in the throughput tests represent the calibration pose recurring at every sample, whereas the randomly generated unit quaternions used in the execution time tests often result in unrealistic poses. This makes the IK based on identity quaternions lighter and the IK based on randomized unit quaternions heavier to calculate than the IK of the orientations that occur during walking or any other typical human motion. Therefore, the execution times can be interpreted as the worst-case performance and the throughput results as the best-case performance when analyzing human motion without live visualization.
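The two kinds of test input can be illustrated as follows. This is a sketch under assumptions: the text does not state how the random unit quaternions were generated, so normalizing a four-dimensional Gaussian vector (a standard way to draw uniformly distributed orientations) is used here only as an example, and all names are hypothetical.

```python
import math
import random

IDENTITY_QUATERNION = (1.0, 0.0, 0.0, 0.0)  # (w, x, y, z), no rotation


def random_unit_quaternion(rng):
    """Draw a random orientation by normalizing a 4D standard normal
    vector (assumed method, not necessarily the one used in the tests)."""
    while True:
        q = [rng.gauss(0.0, 1.0) for _ in range(4)]
        norm = math.sqrt(sum(c * c for c in q))
        if norm > 1e-12:  # guard against a degenerate near-zero draw
            return tuple(c / norm for c in q)


def rotation_angle_deg(q):
    """Rotation angle of a unit quaternion relative to the identity."""
    w = max(-1.0, min(1.0, abs(q[0])))
    return math.degrees(2.0 * math.acos(w))


rng = random.Random(0)  # fixed seed, chosen only for repeatability
sample = random_unit_quaternion(rng)
print(rotation_angle_deg(IDENTITY_QUATERNION))  # 0.0
print(rotation_angle_deg(sample))               # between 0 and 180 degrees
```

Because each IMU receives an independent random orientation, neighboring segments can end up rotated far apart from each other, which is what makes such poses anatomically unrealistic.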
Execution times of the IK operation
Table 1 shows that with one and seven IMUs, the execution times are shorter and vary less for the lower body model (Gait2392) than for the full-body model (Hamner); that is, the lower body model is both faster and more consistent to solve. Both the mean execution times and the standard deviations are smaller on the desktop than on the laptop. Interestingly, on the full-body model the execution times vary less with 12 IMUs than with seven.
For both models, the standard deviations of the execution times are on the same scale as the mean execution times, implying that the execution time varies greatly. The randomized nature of the quaternion orientations is a likely contributor to the high standard deviation, because randomized orientations occasionally lead to segment orientation combinations that do not reflect valid human motion and take the IK algorithm a varying amount of time to solve. During the development of the test program, it was noticed that the results varied greatly between runs, implying that more than 10 000 IK operations are required to draw firm conclusions. However, running the test even with 10 000 operations could take up to 20 minutes, so the number of operations was not increased.
The 95% confidence intervals of the execution times are roughly 1% of the mean execution time in all cases, meaning that the execution times stay consistently below 75 ms except when 12 IMUs are used. In that case, the execution times stay consistently below 100 ms. Although measuring full-body motion with 12 IMUs fails to meet the strictest criterion for delay, an execution time below 100 ms is less than half of the 210 ms delay at which subjects still perceived the motion as real-time in 50% of the cases [21]. Therefore, while the execution times with 12 IMUs are not ideal, they are still acceptable. Because the execution times represent the minimum delay from the retrieval of the orientation data to the moment the IK output can be visualized or analyzed further, the number of IMUs in a real-time measurement should be chosen considering the delays that are acceptable for the application.
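For reference, a 95% confidence interval of a mean execution time can be computed as sketched below. The normal approximation with z = 1.96 and the function name are assumptions, because the text does not state how the intervals were calculated, and the sample values are made up for illustration.

```python
import math
import statistics


def mean_confidence_interval(samples, z=1.96):
    """Return (mean, half_width) of a normal-approximation 95% CI."""
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)  # sample standard deviation
    half_width = z * std / math.sqrt(len(samples))
    return mean, half_width


# Hypothetical execution times in ms: 10 000 values alternating 40 and 60.
times_ms = [40.0, 60.0] * 5000
mean_ms, half_width_ms = mean_confidence_interval(times_ms)
print(mean_ms)                  # 50.0
print(round(half_width_ms, 3))  # 0.196
```

With 10 000 operations the half-width is 1.96/100 of the standard deviation, so the interval around the mean stays narrow even when individual execution times vary widely.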
Throughputs
Figure 2 shows that increasing the number of concurrent IK threads increased the throughput up to about eight threads, which was the maximum number of CPU cores on both computers. Increasing the number of IK threads further had no meaningful effect on the throughput, which was also observed in an earlier study on RTIK [13]. During the testing, CPU utilization reached 100% at six and eight concurrent threads on the desktop and the laptop, respectively, while memory utilization did not get close to its maximum capacity. In terms of CPU clock rate, the desktop was 26% faster than the laptop. In throughput, the closest match to this difference is found with two IK threads, where the desktop was 30–33% faster.
The increase in throughput from multithreading is especially large when a low number of threads is used. For example, on the laptop the throughput increases from less than 50 to approximately 400 samples per second when the number of IK threads increases from one to two. This is at least an eightfold increase, which the doubled computational capacity alone cannot explain. The effect is less pronounced but still present on the desktop. Furthermore, the relationship between the throughput and the number of IK threads is clearly nonlinear, whereas an earlier RTIK study found it almost linear [13]. No explanation for this phenomenon was found, but it should be addressed in the future development of the software library.
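The size of this effect can be quantified with the usual speedup and parallel efficiency measures. The sketch below uses the approximate laptop values quoted in the text (50 and 400 samples per second); the function name is hypothetical.

```python
def speedup_and_efficiency(throughput_one_thread, throughput_n_threads, n_threads):
    """Speedup relative to one thread and speedup per thread.

    An efficiency of 1.0 corresponds to ideal linear scaling; a value
    above 1.0 indicates superlinear scaling.
    """
    speedup = throughput_n_threads / throughput_one_thread
    efficiency = speedup / n_threads
    return speedup, efficiency


# Approximate laptop values from the text: just under 50 samples per
# second with one IK thread, about 400 with two.
speedup, efficiency = speedup_and_efficiency(50.0, 400.0, 2)
print(speedup, efficiency)  # 8.0 4.0
```

Because the single-thread throughput was below 50 samples per second, the speedup of 8 is a lower bound, and an efficiency of about 4 is far above the value of 1 expected from linear scaling.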
For any number of IMUs and four or more concurrent threads, the lower body model with 23 DOFs performed approximately 10% faster than the full-body model with 29 DOFs. Therefore, model selection has a noticeable effect on the performance of RTIK, and the model with the smallest sufficient number of DOFs should be chosen to maximize RTIK performance.
The number of IMUs had a smaller effect on throughput than model selection and computer hardware. Although it was not reported in the results, the number of IMUs was observed to affect the throughput more strongly when the joint angle boundaries of the model were exceeded. Furthermore, in that situation the effect of model selection on performance increased and became more important than that of computer hardware.
The software library is clearly capable of calculating IK at a higher rate than the lower limit of 50 Hz named by Borbély and Szolgay [10], but requires multithreading to reach it with complex musculoskeletal models. The throughputs reported in this study should be sufficient to match the sampling frequency of most IMUs because they typically have a maximal sampling frequency well below 500 Hz. For instance, the maximum sampling rate of Xsens MTw Awinda IMUs is 120 Hz [1].
Error comparison
Because a lower sampling frequency may reduce the accuracy of measuring sharp peaks in joint angles, joints where the direction of motion changes quickly are likely to have a high ROM error (Fig. 3). During walking, ankle flexion (ankle_angle_r and ankle_angle_l) undergoes fast changes, which explains why its ROM error stands out. However, because all ROM errors remain below 0.3 degrees, the effect of the drop in the rate of visualized IK from 60 to 45 Hz on ROM is very small.
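The mechanism can be demonstrated on a synthetic signal. The following sketch samples a made-up periodic joint angle at 60 Hz and at 45 Hz and compares the resulting ROMs; the signal, its amplitudes and all function names are hypothetical and are not intended to reproduce the values in Fig. 3.

```python
import math


def sample_signal(signal, rate_hz, duration_s):
    """Sample a continuous-time signal at a fixed rate."""
    n = int(round(rate_hz * duration_s)) + 1
    return [signal(i / rate_hz) for i in range(n)]


def range_of_motion(angles_deg):
    """ROM as the difference between the largest and smallest angle."""
    return max(angles_deg) - min(angles_deg)


def synthetic_joint_angle(t):
    """Hypothetical joint angle in degrees: a 1 Hz gait cycle plus a
    faster 3 Hz component that sharpens the peaks."""
    return (20.0 * math.sin(2.0 * math.pi * 1.0 * t)
            + 5.0 * math.sin(2.0 * math.pi * 3.0 * t + 0.5))


rom_60 = range_of_motion(sample_signal(synthetic_joint_angle, 60.0, 10.0))
rom_45 = range_of_motion(sample_signal(synthetic_joint_angle, 45.0, 10.0))
rom_error = abs(rom_60 - rom_45)
print(rom_60, rom_45, rom_error)
```

The sharper the peak of the signal, the further a sample can fall below the true extreme between two sampling instants, which is why quickly changing joint angles are the most sensitive to a reduced rate.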
The ROM error of left hip adduction stands out because it is visibly higher than that of the right hip. The error is caused by an artifact in the IMU signal that made the left leg jerk violently to the right after the left toe-off phase. The artifact was probably caused by distortion of the magnetic field near the ferromagnetic laboratory hardware, to which the left leg was closer.