Performance analyses of step-counting algorithms using wrist accelerometry

doi:10.21203/rs.3.rs-2183645/v1

Research Article

This work is licensed under a CC BY 4.0 License

Version 1

Step count is one of the most used real-world (RW) outcomes for understanding physical functioning, activity, and overall quality of life. In the current investigation, we systematically evaluated the performances of modern wrist-accelerometry-based algorithms based on peak detection, autocorrelation, template matching, movement frequency detection, and machine learning (ML) on a common dataset that included continuous walking trials of varying speeds and regularities. The accuracies were computed with respect to the ground truth step count derived using smartphone-based video recordings. On average, the movement frequency detection-based and ML-based algorithms outperformed the other algorithms, showing the highest accuracies across all trials (95.3 ± 6% to 96.7 ± 6.41%). The other algorithms showed varied accuracies ranging from 59.8 ± 41% to 90.11 ± 10.3%. All algorithms showed relatively lower accuracies for the 1-minute slow walks and relatively higher accuracies for the longest walking trial of 6 minutes. Except for two algorithms (autocorrelation and template-based), all algorithms showed no significant effect of the device type (CentrePoint Insight Watch vs GT9X) or device placement (left wrist vs right wrist) on accuracies for all trials. The smartphone-based step detection algorithm showed the lowest accuracies and the largest variability, suggesting the need for fit-for-purpose algorithms in step count estimation using wrist accelerometry. The current investigation provides essential evidence to facilitate the application of wearable digital health technologies in clinical research and care.

Biomedical Engineering

step count

wrist accelerometry

wearables

digital health

actigraphy

Walking is associated with physical health and functioning which are essential for performing activities of daily living and maintaining a good quality of life (QoL)[1]. Recent advancements in wearable sensors and algorithms for processing sensor data have created numerous opportunities for objective assessment of walking in the real-world (RW). This shift from laboratory to RW is critical not only for assessing the true impact of the underlying clinical conditions on a patient’s function but also for objectively monitoring the efficacy of any intervention aimed at addressing such conditions. The development of various gait-related digital outcomes using wearables is an ongoing effort. Among many such outcomes, step count derived during walking is one of the most used RW outcomes for understanding physical functioning, sedentary behavior, physical health, and overall QoL across diverse populations [2]–[5]. Step count is an intuitive, objective, and easily understandable [6] outcome of walking. Further, step count monitoring can drive a behavioral change that can lead to increased steps and physical activity [7]. As a result, step count estimation and validation have gained substantial traction in wearables and digital health technologies (DHT) as well as in consumer product development [8].

Step counting using wearable sensors, however, can be analytically challenging in the RW environment [8]. Many factors can impact the accuracy of step count estimation, such as device models, wear locations, and study populations. While simple algorithms can work very well for counting steps with sensor data collected from ankle-worn devices, step detection based on sensor data collected from the wrist requires much more sophisticated algorithms, as the arms produce diverse secondary movements with and without the typical swing movement, resulting in highly variable movement profiles in the RW. Regardless of these challenges, wrist-worn devices are increasingly becoming the mainstream choice for clinical research, thanks to the comfort, easy acceptance, non-stigmatization, and thus high compliance associated with them [9]. Therefore, it is critical to assess the accuracy and performance of the algorithms for step detection based on wrist-worn accelerometry data. Such algorithm performance assessment is essential for meaningful interpretation when wrist-worn devices are used as DHT in clinical trials and care.

To address this need, we systematically evaluated the performances of modern wrist-accelerometry-based algorithms on a common dataset that includes continuous walking trials of varying speeds and regularities. The data were recorded using the ActiGraph GT9X Link and the CentrePoint Insight Watch (CPIW), two widely used research-grade activity monitors by ActiGraph (ActiGraph LLC, Pensacola, FL). Three recent step count algorithms from ActiGraph and its partner were evaluated. To benchmark their performance, we also selected three state-of-the-art step count algorithms from the literature. The algorithms evaluated in this investigation utilize a variety of processing methods such as peak detection, autocorrelation, template matching, movement frequency detection, and machine learning. Therefore, the results reported in this investigation represent the performances of a truly diverse group of algorithms on a single dataset, something that has not been previously reported. As a secondary objective, we also evaluated whether these algorithms are agnostic to the device placement (left wrist vs right wrist) and device model (CPIW vs GT9X). These findings provide the first systematic comparison across various algorithms and unbiased evidence regarding their strengths and weaknesses, supporting an informed choice of algorithm in clinical applications.

Data were collected from a cohort of 17 healthy volunteers (mean age: 38.1 ± 13.5 years, range: 19–60 years) who performed several walking trials outdoors on a straight paved walkway. Each participant performed all walking trials in a single visit. All participants provided verbal informed consent prior to their participation. The participant demographics are reported in Table 1.

Table 1

Study participant demographics. sd: standard deviation
	n = 17
Age (years)	mean (sd):	38.1 (13.5)
Age (years)	Range:	19–60
Height (m)	mean (sd):	1.73 (0.1)
Height (m)	Range:	1.6–1.9
Weight (kg)	mean (sd):	83.26 (19)
Weight (kg)	Range:	56.8-118.6
Sex	10 male, 7 female

Sensors

All participants wore both a GT9X Link and a CPIW on each wrist. The GT9X devices were placed on the wrist using custom-built holsters and wristbands. The data were sampled at 100 Hz and 32 Hz for the GT9X and the CPIW devices, respectively.

Testing Procedures

1-Minute Walk Tests (1MWTs)

Three 1MWTs were performed at self-selected comfortable, fast, and slow speeds. Additionally, two 1MWTs were performed at a self-selected comfortable speed with two conditions – i) holding a phone in one (any) hand and ii) (any) hand in a pocket. Thus, a total of five 1MWTs were performed in the order of comfortable, fast, slow, phone-in-hand, and hand-in-pocket conditions.

6-Minute Walk Test (6MWT)

The 6MWT, a clinical test for evaluating walking function and physical endurance in which participants walk continuously for six minutes, was performed. The participants walked around two cones placed approximately 150 meters apart on a paved walkway. During the test, breaks were given if needed while the timer kept running. All participants were asked to walk at a comfortable speed.

Ground truth

Videos were captured using an iPhone 11 Pro Max for all trials. Videos were segmented for each trial and each file was carefully analyzed to count the steps. Step counts reported by multiple raters were compared to ensure the agreement in the measured step counts.

Algorithms

Algorithm	Method	Brief description
Lee et al. [10]	Peak detection with adaptive thresholds	The Lee et al. algorithm was developed for smartphone-based step detection. The algorithm’s robustness comes from its ability to negate the effect of device pose and walking conditions (step mode). The algorithm extracts peaks and valleys from the acceleration magnitude data. Next, adaptive magnitude thresholds consisting of step average and step deviation are applied to suppress pseudo peaks or valleys that result from step mode and device pose changes. Adaptive temporal thresholds are applied to time intervals between peaks or valleys to account for the time-varying pace of human walking or running when selecting peaks or valleys. A step is defined as a combination of a peak and its adjacent valley.
Adaptive Empirical Pattern Transformation (ADEPT) [11], [12]	Template matching	The ADEPT algorithm is used for estimating step counts. The algorithm segments walking strides (two consecutive steps) in the vector magnitude of triaxial accelerometry data. The algorithm uses a pre-defined template i.e. a pattern of stride and detects its repetitions by maximizing the local correlation between the series of time-transformed templates and vector magnitude of raw acceleration at every time point. The time transformation manipulates the duration of the pattern, allowing for the detection of free-living strides of varying speeds.
CentrePoint Insight Watch v1 (CPIWv1)	Machine learning with autocorrelation	The CPIWv1 algorithm is a combination of ActiGraph’s Moving Average Vector Magnitude (MAVM) [13] and the University of West Florida’s (UWFv1) [14] step-counting algorithms. For the CPIWv1, the raw data are split into 4-second non-overlapping windows and each window is fed to UWFv1. In UWFv1, nine features such as average, peak-to-peak, zero crossings, etc. are extracted and fed to a Random Forest (RF) of 10 decision trees to classify the 4-second window as walking/running or other. If walking/running is detected, autocorrelation is used to calculate the step count. If the window is detected as “other”, the MAVM algorithm calculates a moving average of vector magnitude to find a reference level for the signal to perform thresholding and other operations. Times between zero crossings and amplitudes of sinusoid peaks/valleys are used to qualify if a section of the signal should be counted as a step or not.
CentrePoint Insight Watch v2 (CPIWv2)	Machine learning with autocorrelation	The CPIWv2 algorithm is a combination of improved UWFv1 (referred to as UWFv2) and MAVM. Compared to the UWFv1 algorithm, the UWFv2 algorithm included the RF classifier trained on the datasets collected from participants of wider age ranges performing activities as well as null activities (non-walks), thus improving the accuracy of step detection and eliminating false positives. CPIWv2 also included improved parameters for selecting peaks of autocorrelation, providing improved accuracy for step detection.
CSEM	Adaptive motion frequency tracking combined with an activity classifier	Custom-developed step detection algorithms by ActiGraph’s partner institute - CSEM [15]. The raw accelerometer signals are used to classify the activity on a sample-by-sample basis. Several features extracted from the raw signals such as signal strength, rhythmicity, and frequency stability are used as predictors in a binary classification tree, and activities are classified into rhythmic (e.g., run, walk) and non-rhythmic activities. For rhythmic activities, a spectral analysis is performed and the movement frequency, i.e., the cadence (steps/min), is computed from the accelerometer signals. Steps can thus be periodically added to the total step count. For the non-rhythmic activities, a time domain analysis is performed in which each oscillation is considered. If the oscillation satisfies a set of conditions for duration, strength, and shape, the total step count is increased by one, otherwise, no steps are counted.
Femiano autocorrelation (auto) and windowed peak detection (WPD) [16]	Autocorrelation and peak detection	The Femiano auto algorithm uses an autocorrelation algorithm designed by Rai et al. [17] while the Femiano WPD is based on the original WPD algorithm by Gu et al. [18]. Both algorithms were validated by Femiano et al. for step detection using wrist accelerometry data [16]. The Femiano auto algorithm, similar to ADEPT, utilizes the repeatable profile observed in the raw acceleration signal recorded during walking. Windows of the signal are correlated with subsequent windows for different window sizes. If the autocorrelation coefficient surpasses a set threshold value, the movement state is identified as walking and the corresponding time lag is stored. The Femiano WPD algorithm scans for peaks in the acceleration signal and applies various constraints, e.g. peak magnitude and distance to the previous peak, to eliminate false peaks. The remaining peaks are counted as steps.
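
To make the two simplest method families in the table concrete, the following is a minimal Python sketch of peak counting and autocorrelation-based counting on the vector magnitude of a triaxial signal. It is not the implementation of any evaluated algorithm: the function names, the thresholds (minimum step interval, minimum prominence, lag range), and the noise-free synthetic signal (32 Hz, 1.8 steps per second for 60 s) are all assumptions chosen for illustration.

```python
import numpy as np
from scipy.signal import find_peaks


def vector_magnitude(acc):
    """Euclidean norm per sample; acc has shape (n_samples, 3), in g."""
    return np.linalg.norm(acc, axis=1)


def count_steps_peaks(vm, fs, min_step_interval_s=0.3, min_prominence=0.1):
    """Count one step per retained peak of the mean-removed vector magnitude.

    Both thresholds are illustrative assumptions, not values taken from
    any of the evaluated algorithms.
    """
    centered = vm - vm.mean()
    min_distance = max(1, int(round(min_step_interval_s * fs)))
    peaks, _ = find_peaks(centered, distance=min_distance,
                          prominence=min_prominence)
    return len(peaks)


def count_steps_autocorr(vm, fs, min_period_s=0.3, max_period_s=1.5):
    """Take the step period as the autocorrelation maximum inside a
    plausible lag range, then divide the window length by that period."""
    centered = vm - vm.mean()
    # Keep non-negative lags only.
    ac = np.correlate(centered, centered, mode="full")[len(centered) - 1:]
    lo = int(round(min_period_s * fs))
    hi = int(round(max_period_s * fs))
    lag = lo + int(np.argmax(ac[lo:hi + 1]))
    return int(round(len(vm) / lag))


# Synthetic, noise-free walk: 1.8 steps/s for 60 s -> 108 true steps.
fs = 32
t = np.arange(0, 60, 1 / fs)
true_steps = 108
acc = np.column_stack([
    np.zeros_like(t),
    np.zeros_like(t),
    1.0 + 0.3 * np.sin(2 * np.pi * 1.8 * t),
])
vm = vector_magnitude(acc)

n_peaks = count_steps_peaks(vm, fs)
n_auto = count_steps_autocorr(vm, fs)
print(n_peaks, n_auto, true_steps)
```

On this idealized signal both estimates should land close to the true count. The autocorrelation estimate is limited by the integer lag resolution (one sample at 32 Hz), which is one reason practical algorithms refine the peak location or work on short windows. Real wrist data would also need handling of irregular arm movement, which this sketch ignores.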

Performance metrics and statistical analyses

The following performance metrics were derived for each algorithm. The percent error (relative error) was calculated as \(100 \times \left|\frac{\text{Estimated}-\text{Measured}}{\text{Measured}}\right|\). The accuracy was calculated as 100 minus the percent error and is expressed as a percentage. The root mean squared error (RMSE) was computed as well. Bland-Altman plots (difference plots) [19] were generated to visualize the mean bias and the limits of agreement (LoA; mean bias ± 1.96 SD). Further, the agreement between the measured and estimated step counts was evaluated using Pearson’s correlation coefficient, r. Paired t-tests were conducted to evaluate the effects of device type and placement location on the accuracy of step count estimation. The significance level was set at 5%.
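
The percent error, accuracy, RMSE, and Pearson's r defined above can be sketched in a few lines of Python. The function name and the four example step counts below are hypothetical, and the use of the sample standard deviation (ddof=1) for the "mean ± SD" summary is an assumption, since the text does not state which estimator was used.

```python
import numpy as np


def step_count_metrics(estimated, measured):
    """Per-trial percent error and accuracy, plus RMSE and Pearson's r."""
    estimated = np.asarray(estimated, dtype=float)
    measured = np.asarray(measured, dtype=float)
    percent_error = 100.0 * np.abs((estimated - measured) / measured)
    accuracy = 100.0 - percent_error
    rmse = float(np.sqrt(np.mean((estimated - measured) ** 2)))
    r = float(np.corrcoef(estimated, measured)[0, 1])
    return {
        "accuracy_mean": float(accuracy.mean()),
        "accuracy_sd": float(accuracy.std(ddof=1)),  # assumed estimator
        "rmse": rmse,
        "pearson_r": r,
    }


# Hypothetical step counts for four trials (not data from this study).
measured = [100, 110, 120, 95]
estimated = [98, 112, 120, 100]
metrics = step_count_metrics(estimated, measured)
print(metrics)
```

Because the percent error uses an absolute value, the accuracy defined this way cannot distinguish over-counting from under-counting; the signed bias from the Bland-Altman analysis carries that information.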

Among all the algorithms evaluated, the CSEM and CPIW algorithms performed the best across all trials (Fig. 1). The CSEM algorithm estimated the step counts with the highest accuracy (96.66 ± 6.41%) and an RMSE of 11.4 steps with all trials combined. The two versions of the CPIW algorithm, CPIWv1 and CPIWv2, performed at 95.09 ± 7.47% (RMSE: 10.57) and 95.31 ± 6% (RMSE: 9.4), respectively, with CPIWv2 showing the lowest RMSE among all algorithms. The ADEPT algorithm performed at 90.11 ± 10.28% accuracy (RMSE: 17.94). The Femiano autocorrelation and windowed peak detection algorithms performed at accuracies of 82.7 ± 22.2% (RMSE: 43.05) and 89.13 ± 10.71% (RMSE: 31.41), respectively. Lee et al. showed the lowest accuracy of 59.8 ± 41.4% (RMSE: 155.19), with the largest variability in accuracy as shown by the standard deviation values.

To use step count as a clinical outcome of RW function and mobility, it is important to evaluate the algorithm performance under different walking conditions. In general, accuracy tends to be lower during slow walks (Fig. 2). For the 1MWT at slow speed, the Lee and Femiano auto algorithms performed poorly, showing the lowest mean accuracies and high RMSE (Lee: 59.1 ± 40.56%, RMSE = 52.15; Femiano auto: 62.11 ± 33.47%, RMSE = 42.96), whereas ADEPT (82.14 ± 18.84%, RMSE = 20.54), Femiano WPD (84.4 ± 12.9%, RMSE = 19.08), CPIWv1 (89.35 ± 13.14%, RMSE = 14.65), CPIWv2 (92.67 ± 8.94%, RMSE = 10.6) and CSEM (94.54 ± 11.69%, RMSE = 9.63) showed moderate to high accuracies for the same trial. All algorithms achieved high accuracies (> 90%) for the 6MWT, except for Lee (62.24 ± 41.13%). Overall, the CSEM and CPIW algorithms showed superior performances across the various walking trials.

The Bland-Altman analyses were conducted to assess the agreement between the estimated and measured step counts (Fig. 3). The CSEM algorithm showed one of the lowest mean biases (-1.33 steps), with limits of agreement (LoA) of -24 to 21 steps. The CPIWv1 and CPIWv2 algorithms showed mean biases of -1.19 and -4.05 steps, respectively. The LoA for CPIWv1 and CPIWv2 were found to be -22 to 19 steps and -21 to 13 steps, respectively. The mean biases and LoA of Femiano auto, Femiano WPD, and ADEPT were found to be 1.72 (-83 to 86) steps, -12.47 (-71 to 47) steps, and -4.55 (-39 to 30) steps, respectively. Lee showed the largest mean bias of 52.75 steps with LoA of -233.71 to 339.21 steps.
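
For reference, the mean bias and limits of agreement reported here follow the usual Bland-Altman construction (mean difference ± 1.96 SD of the differences). The sketch below assumes the difference is taken as estimated minus measured and uses the sample standard deviation; the function name and the example counts are hypothetical and are not data from this study.

```python
import numpy as np


def bland_altman(estimated, measured):
    """Return (mean bias, lower LoA, upper LoA) in steps."""
    diff = np.asarray(estimated, dtype=float) - np.asarray(measured, dtype=float)
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))  # sample SD of the differences (assumed)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical step counts for four trials.
measured = [100, 110, 120, 95]
estimated = [98, 112, 120, 100]
bias, loa_low, loa_high = bland_altman(estimated, measured)
print(f"bias = {bias:.2f} steps, LoA = {loa_low:.2f} to {loa_high:.2f} steps")
```

Under this sign convention a negative bias means the algorithm under-counts on average, and the width of the LoA interval reflects trial-to-trial scatter rather than systematic error.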

Pearson’s correlation statistics showed a significant linear relationship (p < 0.05) between the estimated and measured step counts across all walking conditions for all algorithms except Femiano auto and ADEPT (Table 3). The ADEPT algorithm showed significant correlations for all tests except for the 1MWT Fast (p = 0.399). The Femiano autocorrelation algorithm showed a significant correlation between estimated and measured step counts only for the 1MWTs with slow (p = 0.016), fast (p < 0.05), and hand-in-pocket (p = 0.027) conditions.

Table 3

Pearson correlation statistics (*p < 0.05)
Algorithm	1MWT Comfortable	1MWT Slow	1MWT Phone in Hand	1MWT Fast	1MWT Hand in Pocket	6MWT
Lee	r(58) = 0.5, p < 0.001*	r(58) = 0.4, p = 0.001*	r(58) = 0.4, p = 0.002*	r(58) = 0.66, p < 0.001*	r(58) = 0.38, p = 0.003*	r(58) = 0.47, p < 0.001*
Femiano auto	r(58) = 0.07, p = 0.593	r(58) = 0.31, p = 0.016*	r(58) = 0.21, p = 0.115	r(58) = 0.54, p < 0.001*	r(58) = -0.29, p = 0.027*	r(58) = -0.13, p = 0.332
Femiano WPD	r(58) = 0.51, p < 0.001*	r(58) = 0.59, p < 0.001*	r(58) = 0.4, p = 0.002*	r(58) = 0.74, p < 0.001*	r(58) = 0.36, p = 0.005*	r(58) = 0.58, p < 0.001*
ADEPT	r(58) = 0.73, p < 0.001*	r(58) = 0.33, p = 0.009*	r(58) = 0.65, p < 0.001*	r(58) = 0.11, p = 0.399	r(58) = 0.4, p = 0.001*	r(58) = 0.85, p < 0.001*
CSEM	r(58) = 0.84, p < 0.001*	r(58) = 0.68, p < 0.001*	r(58) = 0.95, p < 0.001*	r(58) = 0.73, p < 0.001*	r(58) = 0.74, p < 0.001*	r(58) = 0.91, p < 0.001*
CPIWv1	r(58) = 0.8, p < 0.001*	r(58) = 0.46, p < 0.001*	r(58) = 0.81, p < 0.001*	r(58) = 0.66, p < 0.001*	r(58) = 0.79, p < 0.001*	r(58) = 0.95, p < 0.001*
CPIWv2	r(58) = 0.75, p < 0.001*	r(58) = 0.73, p < 0.001*	r(58) = 0.89, p < 0.001*	r(58) = 0.77, p < 0.001*	r(58) = 0.84, p < 0.001*	r(58) = 0.97, p < 0.001*

Paired t-tests showed that there were no significant differences in algorithm accuracies for step count estimation between the two devices (Table 4A), except for the Femiano autocorrelation (t(58) = 2.05, p = 0.049) and ADEPT (t(58) = 2.88, p = 0.007) algorithms for the 1MWT comfortable trial. Similarly, there was no significant difference in step count estimation accuracies between the two placement locations (left vs right; Table 4B).

Table 4

Paired t-test statistics to compare accuracy with respect to the device type and placement location (*p < 0.05).
A. Device Comparison: GT9X Link vs CPIW
Algorithm	1MWT Comfortable	1MWT Slow	1MWT Phone in Hand	1MWT Fast	1MWT Hand in Pocket	6MWT
Lee	t(58) = 1.25, p = 0.222	t(58) = 1.46, p = 0.155	t(58) = 1.31, p = 0.201	t(58) = 0.82, p = 0.419	t(58) = 1.02, p = 0.316	t(58) = 1.48, p = 0.151
Femiano auto	t(58) = 2.05, p = 0.049*	t(58) = 0.45, p = 0.655	t(58) = 0.71, p = 0.484	t(58) = 0.37, p = 0.716	t(58) = 1.36, p = 0.183	t(58) = 1.27, p = 0.216
Femiano WPD	t(58) = 0.51, p = 0.613	t(58) = 0.67, p = 0.509	t(58) = 1.35, p = 0.188	t(58) = 1.32, p = 0.197	t(58) = 0.65, p = 0.52	t(58) = 1.2, p = 0.239
ADEPT	t(58) = 2.88, p = 0.007*	t(58) = 1.9, p = 0.068	t(58) = -0.63, p = 0.533	t(58) = 0.55, p = 0.584	t(58) = 0.84, p = 0.41	t(58) = 1.2, p = 0.24
CSEM	t(58) = -0.61, p = 0.549	t(58) = -0.02, p = 0.981	t(58) = -0.54, p = 0.595	t(58) = -0.44, p = 0.663	t(58) = -0.14, p = 0.891	t(58) = -0.68, p = 0.503
CPIWv1	t(58) = 0.27, p = 0.79	t(58) = 0.17, p = 0.867	t(58) = -0.53, p = 0.602	t(58) = 1.23, p = 0.228	t(58) = 0.57, p = 0.571	t(58) = 0.55, p = 0.588
CPIWv2	t(58) = -0.38, p = 0.708	t(58) = 0.53, p = 0.598	t(58) = -0.25, p = 0.805	t(58) = -0.94, p = 0.353	t(58) = 1.74, p = 0.092	t(58) = 0.47, p = 0.644
B. Placement Comparison: Left vs Right wrist
Algorithm	1MWT Comfortable	1MWT Slow	1MWT Phone in Hand	1MWT Fast	1MWT Hand in Pocket	6MWT
Lee	t(58) = 0.05, p = 0.962	t(58) = -0.32, p = 0.749	t(58) = -1.66, p = 0.108	t(58) = -0.74, p = 0.466	t(58) = -1.17, p = 0.252	t(58) = 0.28, p = 0.781
Femiano auto	t(58) = 0.93, p = 0.362	t(58) = 0.32, p = 0.752	t(58) = 1.82, p = 0.08	t(58) = 1.31, p = 0.201	t(58) = -1.19, p = 0.243	t(58) = 0.72, p = 0.477
Femiano WPD	t(58) = 0.57, p = 0.571	t(58) = 1.23, p = 0.23	t(58) = 0.97, p = 0.338	t(58) = -0.28, p = 0.784	t(58) = 0.09, p = 0.926	t(58) = -1.62, p = 0.116
ADEPT	t(58) = 0.73, p = 0.471	t(58) = -0.02, p = 0.985	t(58) = 1.24, p = 0.225	t(58) = 0.09, p = 0.927	t(58) = -1.13, p = 0.267	t(58) = -0.3, p = 0.768
CSEM	t(58) = -0.25, p = 0.803	t(58) = -0.69, p = 0.497	t(58) = 0.84, p = 0.409	t(58) = 0.81, p = 0.426	t(58) = 1.0, p = 0.324	t(58) = -0.43, p = 0.667
CPIWv1	t(58) = -1.22, p = 0.231	t(58) = -1.49, p = 0.146	t(58) = 0.58, p = 0.568	t(58) = 0.73, p = 0.469	t(58) = 0.66, p = 0.512	t(58) = 1.75, p = 0.09
CPIWv2	t(58) = -1.5, p = 0.144	t(58) = 0.24, p = 0.81	t(58) = -0.36, p = 0.721	t(58) = -0.47, p = 0.641	t(58) = -0.13, p = 0.899	t(58) = -0.41, p = 0.688
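
A paired comparison of this kind can be sketched with SciPy's `ttest_rel`. The accuracy values below are randomly generated placeholders (30 hypothetical pairs), not the study data, and the variable names are assumptions; the manual t statistic is included only to show what the library call computes.

```python
import numpy as np
from scipy import stats

# Placeholder accuracies (%) for the same trials recorded by two devices.
rng = np.random.default_rng(0)
acc_device_a = np.clip(rng.normal(95.0, 5.0, size=30), 0.0, 100.0)
acc_device_b = np.clip(acc_device_a + rng.normal(0.0, 2.0, size=30), 0.0, 100.0)

# Paired t-test: are the per-trial accuracy differences centered on zero?
result = stats.ttest_rel(acc_device_a, acc_device_b)

# Same statistic by hand: mean difference over its standard error.
diff = acc_device_a - acc_device_b
t_manual = diff.mean() / (diff.std(ddof=1) / np.sqrt(len(diff)))

significant = result.pvalue < 0.05  # 5% significance level, as in the text
print(f"t({len(diff) - 1}) = {result.statistic:.2f}, p = {result.pvalue:.3f}")
```

Pairing matters here because both devices record the same walk, so between-participant variation cancels out in the differences. With six trials and seven algorithms per comparison, some correction for multiple testing could also be considered, although the text does not mention one.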

The current investigation presents the validation of seven step-counting algorithms on a common dataset consisting of walking trials at varying speeds and conditions. Four of these algorithms (Lee, Femiano auto, Femiano WPD, and ADEPT) were selected as state-of-the-art based on the literature on step estimation using accelerometry. The other three algorithms (CPIWv1, CPIWv2, and CSEM) represent ActiGraph’s existing and partner algorithms for step count estimation. The current investigation builds on the validation efforts of step-counting algorithms, specifically those based on wrist accelerometry. It has been reported that the accuracy of wrist-based devices in detecting steps is lower than that of waist-based devices [20]. Wrist-worn devices, however, can achieve higher compliance than traditional hip- or waist-worn monitors [6] thanks to their convenience and comfort. The algorithms for processing wrist-derived raw accelerometer data are rapidly evolving. Therefore, validation efforts such as the current investigation are critical for establishing the accuracy of step count algorithms specific to the wrist location so that these algorithms can be confidently integrated into clinical trials utilizing DHT.

The need for fit-for-purpose algorithms

The current investigation systematically evaluated, for the first time, a diverse group of step-counting algorithms on a common dataset. These algorithms represent smartphone-based step detection (Lee et al.), smartphone-accelerometry-based algorithms adapted for wrist-accelerometry-based step detection (Femiano et al.), and fit-for-purpose wrist-accelerometry-based step counting algorithms (ADEPT, CPIWv1, CPIWv2, and CSEM). In the current investigation, Lee et al. showed considerably lower accuracies compared to the other algorithms. Lee et al. was originally developed for smartphone-based step detection, with reported average accuracies of over 98.6% for any combination of step mode (walking, running, etc.) and device pose (texting, swinging, pocket, etc.). The results from the current investigation, however, suggest that this algorithm cannot be used “as is” for wrist-worn accelerometry. Note that this finding does not reflect on the performance of Lee et al. in its original use for smartphone-based step detection. The Femiano algorithms represent a class of algorithms that were originally developed for smartphone-based step detection and were later tuned and validated for wrist accelerometry. We implemented Femiano auto and WPD as reported in [16], and no further tuning of the algorithms was performed on the current dataset. Even with the wrist-specific tuning reported in [16], the Femiano et al. algorithms performed at lower accuracies than previously reported (95.9% for Femiano WPD and 94.2% for Femiano auto [16]). These results highlight the need for developing and validating fit-for-purpose algorithms such as ADEPT, CPIW, and CSEM for wrist-accelerometry-based step count estimation.

Strengths and weaknesses of current algorithms

All algorithms developed for wrist-worn accelerometry performed step count estimation at acceptable accuracies. Most algorithms were agnostic to the device type (GT9X vs CPIW), except for Femiano auto and ADEPT for the 1MWT at a comfortable speed (Table 4A). All algorithms showed no significant differences (p > 0.05) in accuracies with respect to the device placement (left wrist vs right wrist) (Table 4B). These results suggest the adaptability and robustness of these algorithms in extracting step counts using wrist accelerometry. It was found that the ML-based algorithms (CPIWv1, CPIWv2) and CSEM’s algorithm delivered superior performance compared to the other algorithms (ADEPT, Lee, Femiano auto, and Femiano WPD). ML-based algorithms are data-driven and require training data for their development, whereas CSEM’s algorithm is driven by a movement frequency detector, which does not require specific training. The other algorithms employ traditional step-by-step processing of the raw signals to extract step count. These algorithms do not require any training data as such, but the algorithm parameters are highly specific to one dataset and may or may not be appropriate for other datasets, populations, or conditions. ADEPT, a template-based algorithm, requires a predefined pattern. We utilized a publicly available pattern [11], [12], which may or may not be appropriate for patient populations or different walking conditions.

For widespread adoption of these algorithms across clinical trials, the algorithm parameters might need to be tuned accordingly, particularly for algorithms such as Femiano et al. and ADEPT, where slight modifications in certain parameters can produce significant differences in accuracy. Femiano et al. reported that empirical tuning of the parameters from the original version (by Gu et al. [18]) was needed to optimize the accuracy. For ADEPT, a slight change in the ‘pattern_dur_grid’ parameter significantly affected the accuracies, and the parameter was set to [0.72–1.7] (i.e., template pattern durations between 0.72 and 1.7 seconds). These values were selected based on the normal stride duration values for healthy young and older adults [21]. In general, the task of tuning these parameters can be subjective, ad hoc, data-specific, and time-consuming. This restricts the adoption of such algorithms on real-world data where ground truths are not available. In the future, if these algorithms are to be utilized for different populations, these parameters need to be tuned on a controlled dataset with ground truth before being applied to real-world data. Even in such a case, the accuracies may remain uncertain without validation against ground truth data in the RW. Further, the performance discrepancies of ADEPT and Femiano et al. can also be attributed to the type of activities included in the current dataset. Our current dataset includes overground walking trials at three categories of self-selected speeds (comfortable, slow, and fast) and two categories with distinct hand postures (hand-in-pocket and phone-in-hand), whereas the Femiano et al. dataset had walking trials at a comfortable speed but under different conditions (walking, running, Nordic, arm movement with and without walking). The average length of a walking bout in the Femiano et al. validation data was 5.22 ± 0.84 minutes. ADEPT was also developed using a dataset with long (bout range: 2.5–4 minutes, distance: 1500 feet), straight walking bouts without any turns [11]. In the current investigation, all walking trials were short (1 minute long) except for the 6MWT. Interestingly, both Femiano et al. algorithms and ADEPT showed their highest accuracies for the 6MWT, the longest walking trial in the current dataset.
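
The sensitivity to the pattern duration grid can be illustrated with a simplified template-matching sketch. This is not the ADEPT implementation or its API: the functions, the synthetic stride shape, and the narrow comparison grid are assumptions, and the search here keeps only the single best global match instead of maximizing the local correlation at every time point. The variable name pattern_dur_grid simply mirrors the parameter named in the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def scale_template(template, n_samples):
    """Linearly resample the template to n_samples points (time scaling)."""
    src = np.linspace(0.0, 1.0, len(template))
    dst = np.linspace(0.0, 1.0, n_samples)
    return np.interp(dst, src, template)


def max_sliding_correlation(signal, pattern):
    """Largest Pearson correlation between the pattern and any
    equally long window of the signal."""
    windows = sliding_window_view(signal, len(pattern))
    w = windows - windows.mean(axis=1, keepdims=True)
    p = pattern - pattern.mean()
    denom = np.linalg.norm(w, axis=1) * np.linalg.norm(p)
    denom = np.where(denom == 0, np.inf, denom)  # flat windows score 0
    return float(np.max((w @ p) / denom))


def best_pattern_duration(signal, fs, template, pattern_dur_grid):
    """Try every candidate stride duration (s) and keep the best match."""
    best_dur, best_corr = None, -np.inf
    for dur in pattern_dur_grid:
        n = int(round(dur * fs))
        corr = max_sliding_correlation(signal, scale_template(template, n))
        if corr > best_corr:
            best_dur, best_corr = float(dur), corr
    return best_dur, best_corr


# Synthetic signal: an assumed stride shape repeating every 1.1 s at 32 Hz.
fs, stride_s = 32, 1.1
t = np.arange(0, 30, 1 / fs)
phase = (t / stride_s) % 1.0
signal = np.sin(2 * np.pi * phase) + 0.5 * np.sin(4 * np.pi * phase)
u = np.linspace(0.0, 1.0, 101)
template = np.sin(2 * np.pi * u) + 0.5 * np.sin(4 * np.pi * u)

full_grid = np.arange(0.72, 1.70 + 1e-9, 0.02)    # range used in the text
narrow_grid = np.arange(1.30, 1.70 + 1e-9, 0.02)  # excludes the true stride

dur_full, corr_full = best_pattern_duration(signal, fs, template, full_grid)
dur_narrow, corr_narrow = best_pattern_duration(signal, fs, template, narrow_grid)
print(dur_full, corr_full, dur_narrow, corr_narrow)
```

When the grid covers the true stride duration, the best match sits near 1.1 s with a correlation close to 1; when the grid excludes it, the best available match is a stretched template with a lower correlation. This is the mechanism by which a grid tuned for one population can degrade stride detection in another.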

Current limitations and future recommendations

The current investigation uses the raw acceleration data recorded during the walking tests, which makes step counting a simpler task than it is in the RW. For application in the RW, the algorithms need to specifically “detect” walking bouts before step count estimation. All of these algorithms (except for Lee et al.) have a walking bout detection stage before extracting step count information. Activity or walking bout classification applied as a top layer before extracting gait features such as step count can help to isolate walking segments and increase accuracy and computational efficiency by avoiding the processing of non-walking data. Lastly, the current investigation uses the algorithms with their default parameters (and the template pattern for ADEPT), and no individual optimization was performed as it was outside the scope of this investigation. The performance of these algorithms would likely improve further if their respective parameters were tuned to the specific dataset or walking conditions.

Conflict of Interest

Rakesh Pilkar, Dawid Gerstel, Matt Biggs, Tyler Guthrie, Sarah Sloan, Joe Nguyen, Matthew R Patterson, Ali Neishabouri, and Christine Guo are employed by ActiGraph. Christopher Moufawad el Achkar, Philippe Renevey, Abolfazl Soltani, Damien Ferrario, and Mathieu Lemay are employed by CSEM, ActiGraph’s partner institute. Marta Karas and Ethan Toole declare no competing interests.

[1] F. C. Bull and A. E. Hardman, “Walking: a best buy for public and planetary health,” p. 2.
[2] K. S. Hall et al., “Systematic review of the prospective association of daily step counts with risk of mortality, cardiovascular disease, and dysglycemia,” Int. J. Behav. Nutr. Phys. Act., vol. 17, no. 1, p. 78, Dec. 2020, doi: 10.1186/s12966-020-00978-9.
[3] B. del Pozo Cruz, M. N. Ahmadi, I.-M. Lee, and E. Stamatakis, “Prospective Associations of Daily Step Counts and Intensity With Cancer and Cardiovascular Disease Incidence and Mortality and All-Cause Mortality,” JAMA Intern. Med., Sep. 2022, doi: 10.1001/jamainternmed.2022.4000.
[4] M. Sheng et al., “The relationships between step count and all-cause mortality and cardiovascular events: A dose–response meta-analysis,” J. Sport Health Sci., vol. 10, no. 6, pp. 620–628, Dec. 2021, doi: 10.1016/j.jshs.2021.09.004.
[5] C. Levin, D. Rand, E. Gil, and M. Agmon, “The relationships between step count and hospitalisation-associated outcomes in acutely hospitalised older adults – A systematic review,” J. Clin. Nurs., p. jocn.16085, Nov. 2021, doi: 10.1111/jocn.16085.
[6] D. R. Bassett, L. P. Toth, S. R. LaMunion, and S. E. Crouter, “Step Counting: A Review of Measurement Considerations and Health-Related Applications,” Sports Med., vol. 47, no. 7, pp. 1303–1315, Jul. 2017, doi: 10/gbh98n.
[7] U. A. R. Chaudhry, “The effects of step-count monitoring interventions on physical activity: systematic review and meta-analysis of community-based randomised controlled trials in adults,” p. 16, 2020.
[8] J. Montes, R. Tandy, J. Young, and S.-P. Lee, “Step Count Reliability and Validity of Five Wearable Technology Devices While Walking and Jogging in both a Free Motion Setting and on a Treadmill,” p. 17, 2020.
[9] S. Mazilu, U. Blanke, and G. Troster, “Gait, wrist, and sensors: Detecting freezing of gait in Parkinson’s disease from wrist movement,” p. 6, 2015.
[10] H. Lee, S. Choi, and M. Lee, “Step Detection Robust against the Dynamics of Smartphones,” Sensors, vol. 15, no. 10, pp. 27230–27250, Oct. 2015, doi: 10/gnh2s4.
[11] M. Karas, M. Stra̧czkiewicz, W. Fadel, J. Harezlak, C. M. Crainiceanu, and J. K. Urbanek, “Adaptive empirical pattern transformation (ADEPT) with application to walking stride segmentation,” Biostatistics, vol. 22, no. 2, pp. 331–347, Apr. 2021, doi: 10.1093/biostatistics/kxz033.
[12] M. Karas, J. K. Urbanek, V. P. Illiano, G. Bogaarts, C. M. Crainiceanu, and J. F. Dorn, “Estimation of free-living walking cadence from wrist-worn sensor accelerometry data and its association with SF-36 quality of life scores,” Physiol. Meas., vol. 42, no. 6, p. 065006, Jun. 2021, doi: 10/gj8wbs.
[13] “Moving Average Vector Magnitude Step Algorithm (V1),” ActiGraph, LLC, Pensacola, FL, Sep. 2016.
[14] S. Bagui et al., “An improved step counting algorithm using classification and double autocorrelation,” Int. J. Comput. Appl., vol. 44, no. 3, pp. 250–259, Mar. 2022, doi: 10.1080/1206212X.2020.1726006.
[15] R. Delgado-Gonzalo et al., “Physical activity profiling: Activity-specific step counting and energy expenditure models using 3D wrist acceleration,” in 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Milan, Aug. 2015, pp. 8091–8094. doi: 10/gpn5sj.
[16] R. Femiano, C. Werner, M. Wilhelm, and P. Eser, “Validation of open-source step-counting algorithms for wrist-worn tri-axial accelerometers in cardiovascular patients,” Gait Posture, vol. 92, pp. 206–211, Feb. 2022, doi: 10/gpn5r9.
[17] A. Rai, K. K. Chintalapudi, V. N. Padmanabhan, and R. Sen, “Zee: zero-effort crowdsourcing for indoor localization,” in Proceedings of the 18th annual international conference on Mobile computing and networking - Mobicom ’12, Istanbul, Turkey, 2012, p. 293. doi: 10/gf296s.
[18] F. Gu, K. Khoshelham, J. Shang, F. Yu, and Z. Wei, “Robust and Accurate Smartphone-Based Step Counting for Indoor Localization,” IEEE Sens. J., vol. 17, no. 11, pp. 3453–3460, Jun. 2017, doi: 10.1109/JSEN.2017.2685999.
[19] J. M. Bland and D. G. Altman, “Measuring agreement in method comparison studies,” p. 26.
[20] J. J. Chow, J. M. Thom, M. A. Wewege, R. E. Ward, and B. J. Parmenter, “Accuracy of step count measured by physical activity monitors: The effect of gait speed and anatomical placement site,” Gait Posture, vol. 57, pp. 199–203, Sep. 2017, doi: 10.1016/j.gaitpost.2017.06.012.
[21] C. Tudor-Locke et al., “How fast is fast enough? Walking cadence (steps/min) as a practical estimate of intensity in adults: a narrative review,” Br. J. Sports Med., vol. 52, no. 12, pp. 776–788, Jun. 2018, doi: 10.1136/bjsports-2017-097628.
