An Activity Recognition Framework for Monitoring Non-Steady-State Locomotion of Individuals with Parkinson’s Disease


Background: Fundamental knowledge in activity recognition of individuals with motor disorders such as Parkinson’s disease (PD) has been primarily limited to detection of steady-state/static tasks (e.g., sitting, standing, walking), and identification of non-steady-state locomotion on uneven terrains (stairs, ramps) has not received much attention. Furthermore, previous research has mainly relied on data from a large number of body locations, which could adversely affect user convenience and system performance. Methods: Here, individuals with mild stages of PD and healthy subjects performed non-steady-state circuit trials comprising stairs, ramp, and changes of direction. An offline analysis using a linear discriminant analysis (LDA) classifier and a Long Short-Term Memory (LSTM) neural network was performed for task recognition. The performance of accelerographic and gyroscopic information from varied lower/upper-body segments was tested across a set of user-independent and user-dependent training paradigms. Results: Comparing the F1 score of a given segment across classifiers showed improved performance using LSTM compared to LDA. Using LSTM, even a subset of information (e.g., feet data) in subject-independent paradigms appeared to provide F1 scores > 0.8. However, employing LDA was shown to be at the expense of being limited to using a subject-dependent paradigm and/or biomechanical data from multiple body locations. Conclusion: The findings could inform a number of applications in the field of healthcare monitoring and developing advanced lower-limb assistive devices by providing insights into classification schemes capable of handling non-steady-state and unstructured locomotion in individuals with mild Parkinson’s disease.


I. BACKGROUND
Parkinson's disease (PD) is a neurodegenerative disorder of the central nervous system affecting approximately 40 million people worldwide [1]. PD is characterized by a number of motor impairments and gait disorders such as tremor, postural instability, bradykinesia and rigidity [2]. To monitor the progression of the disease and to measure the efficacy of treatments, accurate tracking of an individual's motor activities is essential. Current approaches for evaluating the motor function of individuals with PD are limited to observer-based and self-reported methods [3]. In the observer-based assessment, patients are required to travel to a clinic to perform a set of pre-defined tests. The self-reported approach requires individuals to periodically answer a list of questions about their daily activities. Although useful and currently applied in clinical practice, such evaluations have limitations. For instance, they are limited to only a few sessions per year and are costly and inconvenient for both patients and medical providers. They are also subjective and do not adequately reflect motor activities in a free-living environment [4]. Thus, there is a need for systems that are convenient and provide quantitative measures of ambulatory performance. An activity recognition system could provide clinicians with a quantitative profile of motor function in natural settings and over prolonged periods of time, which could further assist them in objectively adapting treatment strategies. Individuals with PD are more susceptible to fall-related injuries due to postural instability and gait disturbances [5,6]. Real-time monitoring of their locomotion could provide important information about the risk of falls, which could subsequently be used to apply timely interventions and prevent associated injuries, leading to better quality of life [7], [8].
Physical activity monitoring could also complement current approaches for detecting disease-specific predictors such as tremor, bradykinesia or hyperkinesia [9,10] to distinguish the symptoms during various locomotor activities. Furthermore, the ability to accurately identify an individual's intended locomotion could help inform the control of assistive devices [11].
While activity recognition has received significant attention, few studies have applied it specifically to individuals with mobility disorders. In PD, neurological disorders caused by the disease such as altered gait, tremor, and limited mobility have the potential to complicate and adversely affect the monitoring of a patient's physical activity. Studies such as [9,10,12] have reported on activity monitoring of individuals with PD; however, there are limitations that need to be addressed. First, the tasks did not comprehensively represent the activities of daily living, focusing instead on recognition of static/steady-state tasks (e.g., walking, sitting, and standing) performed in isolation. Individuals encounter uneven terrain environments (e.g., stairs and ramps), perform dynamic activities and transition from one task to another in their home and community. Deficits in task switching in individuals with mild Parkinson's disease challenge their ability to unconsciously shift their attention from one task to another [13], [14]. This further emphasizes the significance of developing task recognition frameworks capable of handling unstructured and non-steady-state activities.
Second, previous studies primarily relied on input data from the entire body, or from multiple segments such as the trunk, shanks, forearms and thighs [15], [16]. Capturing data from multiple body locations encumbers the patient [17], makes the classification problem more complicated, and increases computation time [18]. An important consideration in activity recognition frameworks is to identify the body locations that provide the best ability to discriminate between tasks with the minimum number of input signals.
Furthermore, previous research has primarily focused on within-subject analysis [10], and the generalizability of such studies to subject-independent scenarios has been an unanswered question.
In this study, we collected data from both healthy subjects and PD patients performing a set of unstructured and non-steady-state activities. The tasks were designed to challenge cognitive impairment (e.g., difficulty with set shifting) of the individuals with PD [13], [14]. An offline analysis using two commonly used classifiers, linear discriminant analysis (LDA) and a Long Short-Term Memory (LSTM) neural network, was performed for task recognition. The generalizability of two user-independent paradigms, trained on healthy subject data and on PD patient data, to a novel subject was tested. Subsequently, the results were compared to subject-dependent analysis. The performance of accelerographic and gyroscopic data from the bilateral feet, forearms, trunk-pelvis, and their fusion was tested across training paradigms. We hypothesized that a more complex classifier (i.e., LSTM) may be more appropriate for modeling non-steady-state tasks, and would outperform LDA. We further hypothesized that collecting data from multiple body locations might not always be necessary, depending on the employed classification algorithm and training data.

II. RESULTS
Within each training paradigm, comparing the F1 scores of a given signal source across classifiers showed improved performance using LSTM compared to LDA, although the differences were not always statistically significant (Fig. 1, Table 1).
Superior performance of LSTM over LDA was most notable for level walking preceding the ramp/stairs (LWp) with able-bodied training data, where LDA provided F1 scores ranging from 0.19 to 0.39, while LSTM significantly increased the outcomes to 0.84-0.9.
Comparing F1 scores of a given signal source and classifier across training paradigms did not demonstrate any statistically significant differences between subject-independent paradigms trained on able-bodied and PD patient data (p > 0.05). However, improved performance of subject-dependent relative to subject-independent paradigms was observed. The differences were larger when LDA was applied (12-133%).
Ramp ascent (RA) was frequently misclassified as stair ascent (SA) and level walking (LW) (Table 2). In subject-independent paradigms, LSTM appeared to be necessary for RA to be detected relatively accurately (F1 score ≥ 0.8). Within subject-independent paradigms, signal fusion provided the highest performance (F1 score = 0.91) when LSTM was trained on PD patient data, although there was not a statistically significant difference between signal fusion and feet data. Training LSTM on able-bodied data provided F1 scores of 0.85-0.87, and no significant differences across signal sources were observed. RA was best classified using the subject-dependent paradigm and LSTM, where all signal sources provided very accurate outcomes (F1 score = 0.94-0.97).
Similar results were obtained for ramp descent (RD), where using LDA with subject-independent paradigms did not provide accurate detection of the locomotion mode (F1 score < 0.8) (Fig. 1, Table 1). RD was frequently confused with stair descent (SD) and LW (Table 2). However, using LSTM and signal fusion in these paradigms improved F1 scores to 0.95. In subject-dependent paradigms, LDA appeared to provide relatively accurate outcomes using all signal sources (F1 score = 0.82-0.9), and LSTM led to highly accurate recognition (F1 score = 0.96-1).
F1 scores below 0.8 were reported for SA using LDA with subject-independent paradigms, except when signal fusion with able-bodied training data was employed (F1 score = 0.91) (Fig. 1, Table 1). SA was mostly confused with RA and LW (Table 2). However, LSTM led to F1 scores of 0.85-0.92 for feet and signal fusion. In the subject-dependent paradigm, both LDA and LSTM appeared to provide improved outcomes (F1 score = 0.86-0.98) for all signal sources.
In SD, only a few signal sources provided relatively accurate (F1 score ≥ 0.8) recognition of the mode using LDA and subject-independent paradigms (Fig. 1, Table 1). SD was frequently confused with RD (Table 2). Applying LSTM, however, led to F1 scores of 0.91-0.96 using feet, trunk-pelvis, and signal fusion. SD was best classified when subject-dependent training data was employed (F1 score = 0.85-1).
Using LDA with subject-independent paradigms resulted in very poor recognition of LWp (F1 score = 0.19-0.6) (Fig. 1). Even in subject-dependent training, all signal sources except signal fusion demonstrated relatively low F1 scores (0.6-0.66) using LDA.
Higher F1 scores were obtained for LWf compared to LWp in most cases, with LSTM outperforming LDA in all training paradigms. Best outcomes were achieved when LSTM was used with subject-dependent training data, where F1 scores of 0.98-1 were reported.

Figure 1. Detection of ramp ascent (RA), ramp descent (RD), stair ascent (SA), stair descent (SD), and level walking preceding (LWp) and following (LWf) ramp/stairs using a linear discriminant analysis (LDA) classifier and a Long Short-Term Memory (LSTM) neural network. Task recognition outcomes using segment biomechanical data were compared across subject-independent (trained on able-bodied and PD patient data) and subject-dependent training paradigms.

III. DISCUSSION
Locomotion identification strategies have the potential to be complicated by parkinsonism-associated gait disturbances such as slowed movements, rigidity, tremor, and postural instability, which could affect the generalizability of the outcomes obtained in healthy subjects to patient populations. Reduced self-regulating mechanisms could severely challenge patient transitions from one task to another throughout the course of the disease [13], [14], [22], [23], and negatively impact the detection of non-steady-state locomotor tasks. Identifying reliable sources of information, appropriate training data, and classification algorithms could significantly improve system outcomes and patient convenience [24]. Therefore, the purpose of this study was to introduce a framework for continuous classification of non-steady-state activities of individuals with PD and to investigate the benefits of different classification schemes for accurate user intent recognition.
Our first hypothesis regarding better performance of LSTM relative to LDA was supported. Within subject-independent paradigms, using a given signal source with LSTM outperformed LDA in different locomotor tasks. This was especially notable in LWp, where LDA resulted in poor task detection (F1 score = 0.19-0.6) while LSTM remarkably improved the outcomes (F1 score = 0.76-0.91). When the locomotion involves combinations of non-steady-state activities (e.g., circuit trials in this study), defining the exact boundaries of non-isolated tasks becomes very challenging. In such scenarios, a given task will have biomechanical characteristics of both the previous and the next activity [25], which could negatively impact the performance of task classification approaches. This is especially evident when the duration of the task is short, so there is not enough time for biomechanical signals to adjust to the ongoing task rather than the previous or next mode. For instance, in this study, level-ground turns before and after the single step on the elevated platform (Fig. 2A) were marked as level walking during classifier training, while they have dominant biomechanical characteristics of the following and the preceding uneven terrains. Learning such complex patterns in the training data and distinguishing between level walking and other modes would be difficult problems for a linear classifier. The high misclassification rates are consistent with this (Table 2). According to our findings, models with non-linear decision boundaries (e.g., LSTM) could be more appropriate for modeling non-steady-state locomotion, especially in complex datasets such as subject-independent paradigms.
In subject-independent paradigms, the number of signal sources providing relatively accurate (F1 score ≥ 0.8) recognition appeared to be higher using LSTM compared to LDA (Table 1). For example, using LDA for detection of RA, RD and LWp resulted in F1 scores < 0.8 for all signal sources. Similar results were observed for SA, SD, and LWf, where only a few signal sources provided relatively accurate outcomes. However, when LSTM was applied, at least two or three signal sources reached F1 scores of 0.8-0.95. This could suggest higher flexibility in selecting input signal locations using LSTM compared to LDA. The results also support the second hypothesis, highlighting the fact that using a more complex classification algorithm could provide simpler alternatives to collecting data from multiple body locations. From a practical standpoint, this could reduce computational complexity and instrumentation cost and improve patient convenience by eliminating the need for sensorizing multiple body segments [26], [27]. For instance, feet signals demonstrated comparable performance to signal fusion in all locomotor tasks when LSTM was applied (p > 0.05), suggesting feet inertial data as the optimal input information that could properly function across a range of activities with minimal instrumentation.
Statistically significant differences were observed between subject-dependent and subject-independent paradigms in most cases when LDA was applied. The lower accuracy of subject-independent paradigms may be indicative of biomechanical differences between healthy subjects and PD patients [28] as well as across PD patients [29], [30]. This could result in high intra-class variations, posing a difficult problem for a linear classifier [31]. However, LSTM appeared to generalize better to such differences. Using LSTM led to more comparable outcomes for subject-independent and subject-dependent paradigms (p > 0.05). For instance, comparing feet/signal fusion outcomes using LSTM across training paradigms did not reveal any statistically significant differences between subject-independent and subject-dependent training. This implies that LSTM could allow building subject-independent activity recognition systems. Unlike subject-dependent systems, they are flexible enough to be applied to different users without the need to retrain the model for each person. This would be of greater benefit in individuals with PD, where training the system for each user could be inconvenient due to the large number of tasks and the increased risk of falls, stumbles and injuries during some activities (e.g., non-steady-state transitions).
Neural networks (e.g., LSTM) can receive raw data with minimum pre-processing, alleviating the need for manual feature engineering, and thus could minimize engineering bias. Frequency or time-domain features [20], [32] used in conventional machine learning algorithms (e.g., LDA) are problem-specific, and do not generalize well to other problems. For instance, the optimum feature set could vary depending on the target activity. However, conventional algorithms are usually mathematically simple and computationally inexpensive, and do not require large amounts of training data. Nonetheless, in this study, employing a mathematically simple classification algorithm such as LDA was shown to be at the expense of being limited to using training paradigms with lower variability (e.g., subject-dependent) and/or instrumenting multiple body locations.
Continuous task classification as implemented in this study can classify data as they are being captured, which is crucial for developing task monitoring scenarios and assistive technologies. For individuals using a robotic orthosis or exoskeleton, it would increase the intuitiveness and volitional behavior of the device and enable smooth transitions between locomotor activities. Continuous classification would also allow adaptive assistance [33], as well as predicting fall risks and intervening on these risks to mitigate falls [34]. The reduced flexibility to adapt to new tasks and the difficulty in performing transitions in individuals with PD [22], [23] further highlight the advantages of developing such frameworks for tracking characteristics of transitional periods as well as steady-state progress.
The study has some limitations. We considered the toe-off event during the transition period as the initiation of the upcoming task. Toe-off could be a relatively accurate approximation of the task initiation when it occurs close to the physical transition point. However, toe-offs occurring at greater distances from the transition point may negatively impact the outcomes, since a large portion of the gait cycle labeled as the upcoming locomotor task is still within the previous mode. This could result in high misclassification rates, especially during the transition period. In future studies, the problem could be addressed by either modifying task separation events, or separating data into steady-state and transitional periods and performing a separate evaluation for each state [35]. Another limitation of this study is that we used motion capture data and not data from wearable IMU sensors. Inherent errors in actual IMUs such as bias and drift [36] may affect system performance, and thus should be taken into consideration in future studies. Further, only a small sample of subjects with mild levels of PD participated in this study. Disease-associated symptoms such as tremor and bradykinesia could be more severe in patients at more advanced stages of PD, which may degrade the performance of classification algorithms.
Future research should investigate the effects of using data from varied levels of disease severity and a larger subject pool to account for potential across-subject variability.

IV. CONCLUSION
We introduced a task recognition framework for tracking relatively unstructured locomotor activities in individuals with mild PD.
Our results demonstrated that models with non-linear decision boundaries (e.g., LSTM) could be more appropriate than linear classifiers (e.g., LDA) for modeling non-steady-state locomotion. LSTM could provide simpler alternatives (e.g., feet data) to collecting data from multiple locations, improving user convenience and reducing the system's computational complexity for its eventual clinical use. The model could also allow building subject-independent activity recognition systems that are flexible enough to be applied to different users without the need to retrain the model each time. These findings could provide insights into designing activity recognition frameworks for healthcare monitoring and lower-limb assistive devices, improving system efficacy and user convenience without sacrificing accuracy.

V. METHODS

A. Subjects and Data Collection
Five healthy subjects (4 males, 1 female, age 25.2±2.5 years, height 1.75±0.11 m, mass 66.8±12.2 kg) and five individuals with early-stage PD (2 males, 3 females, age 62.8±3.9 years, height 1.72±0.03 m, mass 77.5±17.88 kg, Hoehn and Yahr stage 1 or 2) participated in the study after providing written informed consent to participate in the protocol approved by the Institutional Review Board at The University of Texas Southwestern Medical Center. Patients did not have a deep brain stimulator implanted.
Sixty-six reflective markers were attached to anatomical body locations to track 12 body segments of the arms, legs and torso. A 10-camera optical motion capture system (Vicon Motion Systems Ltd, UK) was used to capture marker trajectories at 100 Hz in three-dimensional space. The experimental setup consisted of a "terrain park" circuit including an over-ground walkway, a four-step staircase with a step height of 0.15 m and depth of 0.30 m, a 2.5 m ramp inclined at 10°, and elevated platforms to connect the stairs and ramp (Fig. 2A). The platform contained a single step of height 0.15 m.
Individuals with PD were asked to walk at their comfortable speed and perform five trials of the circuit for both left-leading and right-leading legs in the following orders: stair ascent/ramp descent and ramp ascent/stair descent. They were instructed to use handrails when desired. Healthy subjects performed five trials of the circuit while using the handrails, and five trials without using the handrails.

B. Signal Processing and Classification Schemes
Accelerographic and gyroscopic information of anatomical body segments including feet, trunk-pelvis, and forearms were calculated in three-dimensional trajectories and expressed in local segment coordinate systems using Visual3D (C-Motion, Inc., MD, USA). Locomotor modes included ramp ascent (RA), ramp descent (RD), stair ascent (SA), stair descent (SD), and level-ground walking (LW). In the training data set, changes of direction on the elevated platform, level-walking data that followed stair/ramp, and level walking preceding stair/ramp (Fig. 2A) were all marked as LW. However, level-ground walking data that preceded and followed stair/ramp were tested separately and labeled as LWp and LWf, respectively. The beginning of each locomotor mode was marked as the last toe-off of the transitioning leg on the previous terrain. Data were exported to MATLAB (MathWorks, Natick, MA) for further analysis.
In order to classify the locomotor activities of individuals with PD, the following classification algorithms, training paradigms, and signal sources were studied:
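The toe-off-based labeling rule described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the function name `label_samples`, the sample indices, and the choice of LW as the starting mode are hypothetical and serve only to show how a mode onset at a toe-off sample propagates until the next onset.

```python
def label_samples(n_samples, mode_onsets, initial_mode="LW"):
    """Assign a locomotor mode to every sample of a trial.

    mode_onsets is a list of (toe_off_sample_index, mode) pairs; each mode
    starts at its toe-off sample and lasts until the next onset.
    The initial mode (before the first toe-off) is an assumption.
    """
    labels = [initial_mode] * n_samples
    onsets = sorted(mode_onsets)
    for k, (start, mode) in enumerate(onsets):
        # The mode ends where the next one begins, or at the end of the trial.
        end = onsets[k + 1][0] if k + 1 < len(onsets) else n_samples
        for i in range(start, min(end, n_samples)):
            labels[i] = mode
    return labels


# Hypothetical 10-sample trial: stair ascent starts at sample 3,
# level walking resumes at sample 7.
labels = label_samples(10, [(3, "SA"), (7, "LW")])
```

Under this rule, any portion of the gait cycle between the toe-off and the physical transition point carries the label of the upcoming mode, which is the source of the labeling limitation noted in the Discussion.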

Training paradigms
• Subject independent: Classifiers were trained on able-bodied data and evaluated on PD patient data.
• Subject independent: Classifiers were trained on PD patient data and evaluated using leave-one-subject-out cross-validation.
• Subject dependent: Classifiers were trained on each PD patient's data, and cross-validation was performed within trials of each patient.

Signal sources
• Feet
• Trunk-pelvis
• Forearms
• Signal fusion (combination of feet, trunk-pelvis, and forearms data)

Signals were divided into sliding, overlapping analysis windows of 500 ms with a 250 ms increment [19]. The classifiers assigned each window to one of the locomotor tasks. To classify the tasks using LDA, six time-domain features (minimum, maximum, mean, standard deviation, and first and last sample of each window) were extracted [20]. LSTM was applied to raw data without feature extraction. The number of neurons in the input layer was adjusted according to the number of input signals. For instance, for feet, forearms, and trunk-pelvis data, the input layer comprised 12 neurons, whereas for the combination of all signals (signal fusion) the input layer had 36 neurons. Parameters optimized for LSTM included batch size, number of epochs, and number of hidden units. The optimal values for the parameters were selected not just based on the best outcome but also considering the computation time needed to reach that outcome. For instance, while 200 epochs often provided better outcomes relative to 70, the improvement was negligible. Thus, 70 was selected as the optimal number of epochs. Using a similar approach, the batch size and the number of hidden units were selected as 50 and 100, respectively. A cross-entropy cost function was used to compare the predicted value with the real value during each epoch. Then, the Adam optimizer was applied to reduce the loss by updating the network weights [21].
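As an illustration of the windowing and LDA feature extraction described above, the following Python sketch segments a single channel into 500 ms windows with a 250 ms increment at the 100 Hz capture rate (50 samples per window, 25 samples per step) and computes the six time-domain features. The function names, the use of the sample standard deviation, and the dropping of a trailing partial window are assumptions, not details given in the study.

```python
from statistics import mean, stdev

FS_HZ = 100      # sampling rate reported for the marker trajectories
WINDOW_MS = 500  # analysis window length from the text
STEP_MS = 250    # window increment from the text


def sliding_windows(signal, fs_hz=FS_HZ, window_ms=WINDOW_MS, step_ms=STEP_MS):
    """Split one channel into overlapping windows.

    A trailing partial window is dropped (assumption).
    """
    win = window_ms * fs_hz // 1000   # 50 samples at 100 Hz
    step = step_ms * fs_hz // 1000    # 25 samples at 100 Hz
    return [signal[s:s + win] for s in range(0, len(signal) - win + 1, step)]


def window_features(window):
    """Six time-domain features: min, max, mean, std, first and last sample."""
    # Sample standard deviation is an assumption; the text does not specify it.
    return [min(window), max(window), mean(window), stdev(window),
            window[0], window[-1]]


# Hypothetical one-second ramp signal, used only to exercise the functions.
signal = [float(i) for i in range(100)]
windows = sliding_windows(signal)
features = [window_features(w) for w in windows]
```

In a multi-channel setting, the per-channel feature lists would be concatenated into one feature vector per window for LDA, whereas the LSTM would receive the raw window samples of each channel directly.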

C. System Evaluation
The subject-independent paradigms were evaluated as follows. In the first paradigm, classifiers were trained on only able-bodied data and were tested on each PD patient's data; the results were then averaged across the subjects. In the second paradigm, leave-one-subject-out cross-validation was performed on PD patient data. Each time, data from one individual with PD was left out, and the model was trained on the remaining subjects' data. The data of the subject not included in the training step was used to test the performance of the classifier. The results were reported as the average across the subjects. In the subject-dependent paradigm, data from the same subject was used in the training and test sets. Validation was performed using within-subject leave-one-trial-out cross-validation.
To evaluate the models, we used the F1 score, which is the harmonic mean of precision and recall (1). The F1 score is typically employed for imbalanced datasets where some classes have a larger number of samples than others. In such scenarios, using accuracy as an evaluation metric can be misleading, since the majority class could be classified with high accuracy while the minority class is highly misclassified.
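The leave-one-subject-out procedure described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation; the function names and the subject identifiers are hypothetical. The same splitting logic applies to leave-one-trial-out validation if trial identifiers are passed instead of subject identifiers.

```python
from statistics import mean


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    subject_ids holds one identifier per analysis window.
    """
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def average_across_subjects(scores_by_subject):
    """Average a per-subject metric (e.g., F1 score) across held-out subjects."""
    return mean(list(scores_by_subject.values()))


# Hypothetical window-to-subject assignment for three PD patients.
window_subjects = ["PD1", "PD1", "PD2", "PD3", "PD3"]
folds = list(leave_one_subject_out(window_subjects))
```

Each fold trains on the windows listed in `train` and evaluates on those in `test`, so no data from the held-out subject is seen during training.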

\[ F_1 = 2 \times \frac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{1} \]
In (1), precision is the number of correctly classified samples out of the total number of samples classified as the target class. Recall is the number of correctly classified samples out of the total number of samples of the target class.
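The following Python sketch computes precision, recall, and the F1 score for one target class from the definitions above, and shows with a small made-up label sequence why accuracy can be misleading on imbalanced data. The labels are illustrative only and are not taken from the study's results.

```python
def precision_recall_f1(y_true, y_pred, target):
    """Per-class precision, recall and F1 score; returns 0.0 where undefined."""
    tp = sum(t == target and p == target for t, p in zip(y_true, y_pred))
    fp = sum(t != target and p == target for t, p in zip(y_true, y_pred))
    fn = sum(t == target and p != target for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative imbalanced example: nine level-walking windows, one ramp-ascent
# window, and a classifier that always predicts level walking.
y_true = ["LW"] * 9 + ["RA"]
y_pred = ["LW"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
_, _, f1_lw = precision_recall_f1(y_true, y_pred, "LW")
_, _, f1_ra = precision_recall_f1(y_true, y_pred, "RA")
```

Here the accuracy is 0.9 even though the minority class is never detected, while the F1 score for RA is 0, which exposes the failure.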
Confusion matrices were also computed to quantify the classification results of the proposed scenarios. They provide information about the number of correctly classified as well as misclassified windows across the subjects during each locomotor task. We performed analysis of variance (ANOVA) with the factors being classification algorithms, training paradigms, and signal sources. Post-hoc tests were performed where statistically significant effects were reported (α=0.05).

DECLARATION
• Ethics approval and consent to participate: Participants provided written informed consent to participate in the protocol approved by the Institutional Review Board at The University of Texas Southwestern Medical Center.
• Consent for publication: Not applicable.
• Availability of data and materials: The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
• Competing interests: The authors declare no conflicts of interest.
• Funding: This research was supported by the UT Southwestern Mobility Foundation Center for Rehabilitation Research.