This work aimed to evaluate the performance of a new automated approach for the recognition of clinically relevant events in the STS and to compare its performance with human visual assessment. The results offer a double contribution to prospective research, since we not only quantified the discrepancy between the two methods but also compared it against the maximum error made in repeated visual measures. Despite the significantly lower systematic bias in repeated evaluations, the comparison between visual assessments and the proposed approach showed similar values of maximum absolute error.
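To make the agreement analysis above concrete, the following minimal sketch computes the bias and 95% limits of agreement (Bland-Altman style) between two sets of event timings. The function name and the timing values are hypothetical, for illustration only; they are not the study's data or its exact statistical pipeline.

```python
import numpy as np

def limits_of_agreement(visual, automated):
    """Bias and 95% limits of agreement between two sets of
    event timings (in seconds), Bland-Altman style."""
    diffs = np.asarray(visual) - np.asarray(automated)
    bias = diffs.mean()
    half_width = 1.96 * diffs.std(ddof=1)  # sample SD of the differences
    return bias, bias - half_width, bias + half_width

# Hypothetical Initiation-event timings (s) from ten trials.
visual    = [1.02, 0.98, 1.10, 1.05, 0.99, 1.12, 1.03, 1.00, 1.07, 0.95]
automated = [1.00, 1.01, 1.05, 1.08, 0.97, 1.10, 1.06, 0.98, 1.09, 0.96]

bias, lloa, uloa = limits_of_agreement(visual, automated)
print(f"bias = {bias:+.3f} s, 95% LoA = [{lloa:+.3f}; {uloa:+.3f}] s")
```

The same computation applied to repeated ratings by a single assessor yields the intra-rater limits against which the automated method's discrepancy can be benchmarked.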
More specifically, no statistically significant differences were found in the identification of the Initiation and Sitting events in SP trials, and of only the Sitting event in CT trials. The lower agreement observed during CT movements was generally in line with our expectations, as ULoA values could be affected by uncertainties due to the kinetic modifications resulting from the standardisation of the movement. For instance, the initial GRF deflection is highly dependent on each individual's movement strategy and could be reduced by the smaller momentum produced under constrained speed, complicating the identification of the Initiation event.
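As a rough illustration of why a damped initial GRF deflection complicates Initiation detection, the sketch below flags the event as the first sample where the vertical GRF deviates from its quiet-sitting baseline by more than a multiple of the baseline noise. The function, its parameters, and the synthetic signal are assumptions for illustration; this is a generic baseline-threshold scheme, not the algorithm evaluated in this study.

```python
import numpy as np

def detect_initiation(grf, fs, baseline_window=0.5, k=5.0):
    """Return the time (s) of the first sample whose vertical GRF
    deviates from the quiet-sitting baseline by more than k baseline
    standard deviations, or None if no such sample exists.
    Illustrative sketch only, not the study's validated algorithm."""
    grf = np.asarray(grf, dtype=float)
    n_base = int(baseline_window * fs)
    base_mean = grf[:n_base].mean()
    base_std = grf[:n_base].std(ddof=1)
    threshold = k * max(base_std, 1e-6)  # guard against a perfectly flat baseline
    beyond = np.flatnonzero(np.abs(grf - base_mean) > threshold)
    return beyond[0] / fs if beyond.size else None

# Synthetic trace: 1 s of quiet sitting, then a downward GRF deflection
# (the momentum-generation dip preceding seat-off).
fs = 100  # Hz, assumed sampling rate
rng = np.random.default_rng(0)
quiet = 500 + rng.normal(0.0, 0.5, fs)          # ~500 N with measurement noise
deflection = 500 - 30 * np.linspace(0, 1, fs)   # 30 N dip over 1 s
t_init = detect_initiation(np.concatenate([quiet, deflection]), fs)
```

A slower, constrained movement would produce a shallower dip, pushing the threshold crossing later (or missing it entirely), which mirrors the uncertainty discussed above.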
Another important consideration concerns the intrinsic subjectivity of human evaluations. One assessor could consider the slightest oscillation either as an extension of a contiguous static phase or as the boundary of a movement transition. Moreover, visual assessments can vary across repeated evaluations and differ among individuals, depending on their professional experience [30, 31].
Nonetheless, with a maximum estimated discrepancy of 0.200 s [0.039; 0.361] and 0.340 s [0.014; 0.666] for normal and standardised speed, respectively, the proposed method may not be suitable for evaluations on a single patient, where the expertise of health professionals plays a key role in the diagnostic process, often requiring a high level of abstraction. It is also true that much of the data collected through sensor systems is inherently noisy, and its analysis should therefore accommodate some degree of uncertainty. In this context, the presented algorithm can serve as a solid basis for artificial intelligence methods to provide accurate, faster, and scalable results in the field of big data analytics, with interesting applications in Human Activity Recognition tasks [33, 34].
This is the first study to compare human and automated assessments in the identification of a complex motion pattern in the STS movement relying on data collected through a force plate. Previous works [35–39] evaluated the performance of various algorithms developed for the identification of Sit-to-Stand and Stand-to-Sit postural transitions using data acquired from inertial sensors. In particular, a recent paper by Atrsaei and colleagues validated the accuracy of a new routine based on a single device against visual assessments of camera recordings of STS movements, obtaining levels of agreement above 94% in terms of positive predictive value and sensitivity. Compared with the present study, the use of inertial sensors is usually preferable since they can also be applied in non-clinical environments. However, their measurements are strongly influenced by the inter- and intra-individual variability of the movement [20, 42], limiting the recognition of the STS motion pattern to a simple discrimination between static and dynamic phases. Conversely, our choice of a force plate has undeniable limitations in terms of cost and portability, but the strong advantage of providing more easily interpretable results, on which clinically significant movement events and phases can be identified. From this point of view, and supported by recent advances in machine learning, these data can be considered a valid ground-truth reference for training specific IMU-based approaches in a finer recognition of the STS. This transition could allow the development of accessible, wearable rehabilitation tools, combining the discriminating power of gold-standard instruments with the limited dimensions and costs of inertial sensors.
The results highlighted in this work should be appraised in light of some limitations. The standardisation of posture at the beginning of the trials and the homogeneity of the sampled population in HAR1 and HAR2 could restrict the applicability of the presented evidence. Both the setting of the starting position and the individuals' characteristics influence the STS motion strategy and, consequently, the value of the vertical GRF and the successful outcome of the movement [20, 43, 44]. Hence, the presented algorithm must be tested across a different set of initial conditions (e.g. different uses of the arms during the rising, different fatigue conditions of the subject, etc.) and different population groups to explore its real value for clinical applications. Moreover, the limited sample size and the heterogeneous professional background of the assessors could have affected the final estimates of the visual measurements. Further evaluations performed by a larger sample of qualified health professionals could provide a more precise picture of the reliability of the proposed method.