Using machine learning methods to predict physical activity types with Apple Watch and Fitbit data using indirect calorimetry as the criterion.

Background: There is considerable promise for using commercial wearable devices for measuring physical activity at the population level. The objective of this study was to examine whether commercial wearable devices could accurately predict lying, sitting, and the intensity level of other activities in a lab-based protocol.

Methods: We recruited a convenience sample of 49 participants (23 men and 26 women) to wear three devices: an Apple Watch Series 2, a Fitbit Charge HR2, and an iPhone 6S. Participants completed a 65-minute protocol consisting of 40 minutes of total treadmill time and 25 minutes of sitting or lying time. Indirect calorimetry was used to measure energy expenditure. The outcome variable for the study was the activity class: lying, sitting, walking self-paced, and walking or running at 3 METs, 5 METs, and 7 METs. Minute-by-minute heart rate, steps, distance, and calories from Apple Watch and Fitbit were included in four different machine learning models.

Results: Our dataset included 3656 and 2608 minutes of Apple Watch and Fitbit data, respectively. We tested decision trees, support vector machines, random forest, and rotation forest models. Rotation forest models had the highest classification accuracies at 82.6% for Apple Watch and 89.3% for Fitbit. Classification accuracies for Apple Watch data ranged from 72.5% for sitting to 89.0% for 7 METs. For Fitbit, accuracies ranged from 86.2% for sitting to 92.6% for 7 METs.

Conclusion: This study demonstrated that commercial wearable devices, Apple Watch and Fitbit, were able to predict physical activity types with reasonable accuracy. The results support the use of minute-by-minute data from Apple Watch and Fitbit combined with machine learning approaches for scalable physical activity type classification at the population level.


Introduction
The introduction of commercial wearable devices for physical activity monitoring has been an exciting development with the potential to increase physical activity at the population level. 1,2 We define commercial wearable devices as those used primarily by individual consumers for physical activity monitoring rather than for research purposes. 1 Research examining commercial wearable devices has primarily focused on two areas. The first is the reliability and validity of the measures that the devices provide, including step counts, heart rate, and energy expenditure. 3−5 A systematic review found that wearable devices were accurate for tracking step count, but may be less accurate for measuring energy expenditure or heart rate. 6 The second is how the measures available from commercial devices, particularly steps, translate to current physical activity recommendations.
For example, Tudor-Locke et al. (2011) found that approximately 8000 steps/day is a good proxy for 30 minutes of daily moderate to vigorous physical activity (MVPA), and that 7000 steps/day, seven days a week, is consistent with obtaining 150 minutes of weekly MVPA. 7,8 Despite the promising research examining commercial wearables, at least two important areas remain unexplored. First, commercial wearable devices tend to focus on step counts as a user goal, rather than on other factors such as sedentary behaviour and physical activity intensity, which are also movement-related indicators of health. Yet current physical activity guidelines are based on minutes of moderate to vigorous physical activity. 9 Second, commercial wearable devices use proprietary methods for estimating steps, heart rate, calories, sleep, sedentary behaviour, and physical activity. Because these methods are undisclosed, standardization between different commercial wearable devices is difficult or impossible.
The purpose of this study is to examine whether commercial wearable devices (Apple Watch and Fitbit) can accurately predict sedentary behaviour and light, moderate, and vigorous physical activities and to develop indicators to assist in predictions for future studies. We hypothesize that commercial wearable devices will accurately predict moderate and vigorous physical activity, but may not differentiate well between light and sedentary behaviour. As a secondary objective, we examine whether accounting for the type of device could improve classification results. If device type is an important feature for classification, this may be an important first step in standardization between devices.

Participants
We recruited 49 participants (23 men and 26 women) to wear three devices: an Apple Watch Series 2, a Fitbit Charge HR2, and an iPhone 6S. We chose Apple Watch and Fitbit for this study because they have the highest market share among wearable devices. 5

Study Design
Participants engaged in a 65-minute protocol with 40 minutes of total treadmill time and 25 minutes of sitting or lying time. The protocol was similar to previous studies testing the reliability and validity of different physical activity monitors. 12 These studies show that changing speeds is important to ensure devices can recognize intensity changes. Figure 1 shows the study protocol. The first two phases of the protocol involved sedentary activity (i.e., lying on a cot and sitting on a chair) for 5 minutes each. Following this, participants moved to the treadmill and selected a "self-paced" speed for 10 minutes. A 5-minute lying period followed. Participants then moved to the treadmill and walked at a pace of 3 METs for 10 minutes. Following the 3 MET treadmill activity, participants lay on a cot for 5 minutes. Participants then walked at an effort of 5 METs for 10 minutes, followed by a 5-minute sitting period. Finally, each participant completed a 10-minute period at 7 METs. The 5-minute rest periods were sufficient to lower participant heart rate and maintain steady state for these sedentary activities. 13 Additionally, the 10-minute treadmill periods were sufficient to estimate O2 uptake at steady state during each specified activity (light: 3 METs; moderate: 5 METs; vigorous: 7 METs). The information from the metabolic cart was used to create an outcome variable with six predicted classes: lying, sitting, self-paced walking, walking at 3 METs, walking/running at 5 METs, and walking/running at 7 METs.
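The stage order and durations described above can be written out as a simple schedule, which also gives one way to attach an activity label to each minute of device data. This is an illustrative sketch only: the label strings and the `label_for_minute` helper are assumptions introduced here, not part of the study's code.

```python
# Stages of the 65-minute protocol as (label, duration in minutes, on treadmill?).
# Label strings and helper names are hypothetical; order and durations follow the text.
PROTOCOL = [
    ("lying", 5, False),
    ("sitting", 5, False),
    ("self-paced walk", 10, True),
    ("lying", 5, False),
    ("3 METs", 10, True),
    ("lying", 5, False),
    ("5 METs", 10, True),
    ("sitting", 5, False),
    ("7 METs", 10, True),
]


def label_for_minute(minute):
    """Return the activity label for a 0-based minute of the protocol."""
    if minute < 0:
        raise ValueError("minute must be non-negative")
    elapsed = 0
    for label, duration, _ in PROTOCOL:
        elapsed += duration
        if minute < elapsed:
            return label
    raise ValueError("minute is outside the 65-minute protocol")


# Totals implied by the schedule.
total_minutes = sum(duration for _, duration, _ in PROTOCOL)
treadmill_minutes = sum(duration for _, duration, on in PROTOCOL if on)
rest_minutes = total_minutes - treadmill_minutes
```

Summing the schedule gives the 40 treadmill minutes and 25 sitting or lying minutes stated in the protocol, and the nine stages map onto six distinct activity classes.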
For each stage involving a specified MET value, a VO2-to-METs calculator was used to calculate the METs of each individual. METs vary by person and by the specified duration of activity. This variation is related to lean body mass and to other physiological factors such as health status and age. 14−21
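The text does not specify which calculator was used. As a rough sketch of this kind of conversion, the example below assumes the conventional definition of 1 MET as 3.5 mL of O2 per kg of body mass per minute; the function name and the constant's use here are assumptions, and the study's calculator may have differed.

```python
# Conventional resting oxygen uptake that defines 1 MET (mL O2 per kg per minute).
# Assumption: the study's calculator may have used a different or individualized value.
RESTING_VO2_ML_PER_KG_MIN = 3.5


def vo2_to_mets(vo2_ml_per_min, body_mass_kg):
    """Convert absolute oxygen uptake (mL/min) to METs for one participant."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    relative_vo2 = vo2_ml_per_min / body_mass_kg  # mL/kg/min
    return relative_vo2 / RESTING_VO2_ML_PER_KG_MIN
```

Under this assumption, a 70.6 kg participant with an uptake of 1235.5 mL/min has a relative uptake of 17.5 mL/kg/min, which corresponds to 5 METs.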

Measures
The outcome variable for the study was activity class, based on measures from the Oxycon Pro metabolic cart. The study protocol included six classes: lying, sitting, walking self-paced, 3 METs, 5 METs, and 7 METs. The Oxycon Pro has been shown to be a valid and reliable method for measuring energy expenditure. 22 The metabolic cart was calibrated according to manufacturer specifications every morning of data collection.

Analyses
Statistical analyses were performed using R (version 3.6.1) and Weka (version 3.8.3). Data were downloaded from the metabolic cart (Oxycon Pro, Jaeger, Hochberg, Germany). We used previously published methods to convert breath-by-breath data to second-by-second MET intensity estimates. 25
Analysis was conducted separately for Apple Watch and Fitbit. We first cleaned the data and used linear interpolation on steps, heart rate, calories, and floors climbed to impute missing data. Following this, we developed a feature set that included intensity (Karvonen formula), 26,27 which calculates individualized target heart rate parameters; steps entropy, which is a measure of the predictability of step count; and the correlation coefficient between heart rate and steps. 28 We developed the features in order to consider multiple physiological characteristics that could explain sedentary, light, moderate, and vigorous physical activity (see Table 1).
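The paragraph above names the imputation step and three derived features without giving formulas. The sketch below shows one plausible reading of each, written in Python for illustration (the study itself used R and Weka). All function names are hypothetical. The estimate of maximum heart rate as 220 minus age, the use of Shannon entropy over observed step counts, and the Pearson form of the correlation are assumptions, not details confirmed by the text.

```python
import math
from collections import Counter


def interpolate_linear(values):
    """Fill interior None gaps by linear interpolation between known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled  # leading and trailing gaps are left as None


def karvonen_intensity(heart_rate, resting_hr, age):
    """Fraction of heart rate reserve in use; assumes HRmax = 220 - age."""
    max_hr = 220 - age
    return (heart_rate - resting_hr) / (max_hr - resting_hr)


def shannon_entropy(step_counts):
    """Shannon entropy (bits) of the empirical distribution of step counts."""
    counts = Counter(step_counts)
    n = len(step_counts)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat an undefined correlation as zero
    return cov / math.sqrt(var_x * var_y)
```

In this reading, a window of identical step counts has zero entropy (fully predictable), while a window split evenly between two values has an entropy of one bit.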
We used four different classification methods in our analysis: Random Forest, 29,30 Rotation Forest, 31 Support Vector Machine (SVM), 32 and Decision Trees. 33 Model accuracy was examined using k-fold cross-validation. Data were randomly split into 10 subsamples. For each subsample in turn, the classification algorithms were trained on the remaining subsamples and then used to predict the held-out subsample. Prediction errors were summed over all subsamples to produce a final prediction error rate. 34 In each model we included the features described in Table 1 and age, gender, height, and weight. We chose these models because SVM 35 and Random Forest models 36 are common in physical activity research using research-grade accelerometers, and Rotation Forest and Decision Trees are similar methods to Random Forest.
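The 10-fold cross-validation procedure can be sketched as follows. To keep the example self-contained, a simple nearest-centroid classifier stands in for the Random Forest, Rotation Forest, SVM, and Decision Tree models actually used; it is a placeholder, not one of the study's methods, and all function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_nearest_centroid(rows, labels):
    """Stand-in classifier: store the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_nearest_centroid(centroids, row):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_validated_error(rows, labels, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and pool the errors."""
    errors = 0
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(rows)) if i not in held]
        model = train_nearest_centroid(
            [rows[i] for i in train_idx], [labels[i] for i in train_idx]
        )
        errors += sum(
            predict_nearest_centroid(model, rows[i]) != labels[i] for i in held_out
        )
    return errors / len(rows)
```

Every sample is held out exactly once, so the pooled error rate is the total number of misclassified samples divided by the sample size.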
We evaluated model fit using accuracy, sensitivity, specificity, confusion matrices, and feature ranking. Finally, to address our secondary objective, we combined the Fitbit and Apple Watch data and added device type as an additional feature to examine differences between devices.
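For a multi-class outcome such as activity class, sensitivity and specificity are usually computed per class in a one-versus-rest fashion from the confusion matrix. The sketch below illustrates that calculation under this assumption; the function names are hypothetical and the study's own computation in R or Weka may differ in detail.

```python
def confusion_matrix(actual, predicted, classes):
    """Build a matrix where matrix[i][j] counts class i samples predicted as class j."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of all samples that fall on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def per_class_metrics(matrix, classes):
    """One-versus-rest sensitivity and specificity for each class."""
    total = sum(sum(row) for row in matrix)
    metrics = {}
    for i, c in enumerate(classes):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                 # class i predicted as something else
        fp = sum(row[i] for row in matrix) - tp  # other classes predicted as class i
        tn = total - tp - fn - fp
        metrics[c] = {
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        }
    return metrics
```

For example, if one of two lying minutes is misclassified as sitting, sensitivity for lying is 0.5, while specificity for sitting falls below 1 because sitting has gained a false positive.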

Results
Participants included 26 females and 23 males. The average age was 29.3 years (range 18 to 56). Table 1 shows means and standard deviations for continuous variables, and counts and percentages for categorical predictors, for Apple Watch and Fitbit. The average height and weight were 1.69 m and 70.6 kg, respectively. Average heart rate was 91.1 for Apple Watch and 75.3 for Fitbit.
Average steps per minute were 181.4 and 7.7 for Apple Watch and Fitbit, respectively. Table 1 also shows the feature descriptions and descriptive statistics for each feature included in the Rotation Forest model.

DF conceptualized the paper. All authors assisted with data collection. DF, JRA, BS, AB, and HL conducted data analysis. All authors contributed to writing the manuscript and approved the submitted version.

Figure 1: 65-minute lab-based activity protocol
