Fitbit data show poor correlation with measures of activity and sleep among hospitalized general medicine patients

Background: Wearable devices such as Fitbits may provide important insights about hospitalized patients that include data on low activity and poor sleep. Monitoring this information could spur interventions to improve mobility and sleep which may reduce the adverse effects associated with hospitalization. However, there is a lack of studies assessing the accuracy of wearables in hospitalized medical patients. The purpose of our study was to determine the accuracy of Fitbit heart rate, sleep and physical activity in hospitalized medical patients. Methods: We conducted a prospective cohort feasibility study enrolling 50 medical inpatients at two hospitals providing them with a wrist-worn Fitbit Charge. Our main measures were Fitbit heart rate, sleep and activity data as well as nurse recorded heart rates, patient reported sleep, and nurse assessments of activity. Results: Of the 50 patients who consented to the study, 47 patients wore the devices. Comparing pairs of heart rate data from Fitbit and nurse recorded vital signs for the same minute, there were 261 pairs available for comparison. The mean difference was 0.45 bpm (SD: 13.0, Pearson correlation: 0.68 P<0.001) and the 95% limits of agreement were -25 to 26 bpm. The association between the patient-reported sleep score and Fitbit total sleep duration was 0.19 (P=0.24) and between the self-reported hours of sleep and Fitbit total sleep duration was 0.21 (P=0.21). The correlation between nurse-recorded activity and Fitbit daily steps was 0.06 (P=0.52). Conclusions: Fitbit heart rates correlated well with nurse-recorded heart rate but did not correlate well with nurse assessments of activity nor with patient self-assessment of sleep. This study highlights limitations of the accuracy of current wearable wrist-worn device algorithms in activity and sleep detection in patients in hospital. 
The findings call into question the validity of Fitbits for assessment of patient activity and sleep in the hospital setting and suggest that they should not be routinely used without further validation.


Background
The post-hospital syndrome has been described as a condition after inpatient discharge when there is an increased risk of adverse events and re-hospitalization, often for new and different medical conditions. 1 In addition to the original acute medical illness, possible contributors to post-hospital syndrome include factors such as poor sleep and low activity that occur during hospitalization. Sleep in the hospital has been found to be shorter, with more nocturnal awakenings and earlier wake times. 2 The most sleep-disrupting factors were noise from other patients, medical devices, pain and toilet visits. Inpatients have been found to have very low physical activity with a median of 478-1,158 daily steps, compared to a normal of 5,000-6,000 steps for most individuals over 65. [3][4][5][6][7] Currently, inpatient assessments of measures like heart rate and activity are routinely done only several times per day or, in the case of sleep, not at all. 8,9 Affordable consumer wearable devices that continuously measure activity, sleep and vital signs may provide valuable information that could improve the care of inpatients. These devices could provide more frequent monitoring to detect deterioration earlier and avoid failures to rescue. 10 Increased awareness of patients' low activity or poor sleep may prompt interventions to improve these parameters, potentially reduce rehospitalizations and ultimately help reintegrate patients into the community.
Before adopting wearables to measure heart rate, activity and sleep in routine hospital care, it is critical that they are tested to ensure accuracy and reliability. While there has been significant testing of these devices in the healthy and ambulatory population, it is important that the algorithms used to process accelerometer and other sensor data into interpretations of sleep, heart rate and activity are validated in the inpatient setting. Equipment such as walkers, assistive devices, indwelling lines or other hospital devices may interfere with the assumptions inherent in wrist-based activity monitor algorithms. 11 The Fitbit wearable device (Fitbit, Inc) has been studied in the critical care setting, where heart rate has been found to have moderate accuracy in patients who are in sinus rhythm, and Fitbit sleep data correlated moderately with self-reported sleep. 12,13 To date, there have been limited studies on the accuracy of wearables in hospitalized medical patients.

Methods
Objective: To determine the accuracy of heart rate, physical activity and sleep by wearable devices in hospitalized medical patients.
Design: Prospective cohort feasibility study (Clinicaltrials.gov NCT03646435)
Participants: Patients were recruited from two major teaching hospitals of the University of Toronto, Toronto General Hospital and Toronto Western Hospital, both of which have General Internal Medicine services. The General Internal Medicine patients of the University of Toronto hospitals are known to have a median age of 73, a median of 6 comorbidities and are admitted with a wide variety of medical diagnoses. 14 Subject inclusion criteria were English-speaking adults (aged 18 years or older) admitted to a General Internal Medicine service and who were able to provide consent. Subject exclusion criteria were: 1) patients in whom routine vital sign monitoring was not indicated, such as those with palliative, comfort-oriented goals of care, 2) patients with significant cognitive impairment limiting their ability to complete surveys, and 3) patients at risk of vascular compromise of the arm on which the wearable device was to be placed (e.g., dialysis fistulas, peripherally inserted central catheters). Patients isolated under contact precautions were also excluded to reduce the risk of transmitting infection. A convenience sample size of 50 participants was chosen for this pilot study, similar to a recent feasibility wearable study. 13 The Research Ethics Board of University Health Network approved the study (ID# 18-5621). All participants provided written informed consent.
Intervention: Participants wore either the Fitbit Charge 2 or Charge 3, wrist bands that can measure activity, sleep and heart rate (cost $150-160 CAD). Fitbits were selected due to their popularity, acceptability, battery life of up to 5 days, and use in research studies. [15][16][17] Participants were asked to wear the band on their wrists continuously until discharge or for up to one week, whichever came first.
Data Collection: The following information was collected from each participant at enrollment: age and sex. The following information was collected after enrollment: most responsible diagnosis, comorbidities, discharge status (alive, left against medical advice or died in hospital), discharge location (home, home with support services, rehabilitation facility, or supportive housing), length of stay, duration of wearing device, and documentation of atrial fibrillation in previously documented comorbidities or by any ECG performed during the visit.
For each participant we collected the following information during the time of study participation: daily sleep questionnaire completed by patients, heart rates documented by nursing staff as part of vital signs, activity assessments recorded daily by nurses, and Fitbit recordings for heart rate, steps and sleep summary data. At both hospitals involved in this study, vital signs are measured by nurses and are manually entered into the electronic health record.
We collected daily sleep quality from patients using the Richards-Campbell Sleep Questionnaire (RCSQ), a validated survey instrument for measuring sleep quality in hospitalized patients. 18 This survey uses 0-100 mm visual analog scales to assess sleep depth, latency, awakenings, percentage of time awake, and overall quality of sleep. Item values are summed and divided by 5, providing a mean score between 0 and 100 that reflects the patient's perception of their sleep quality, with higher values representing better sleep.
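The scoring rule described above (five visual analog items, summed and divided by 5) can be sketched as follows. This is only an illustration: the item keys and the example responses are hypothetical labels, not names taken from the study's software.

```python
# Hypothetical keys for the five RCSQ items named in the text.
RCSQ_ITEMS = ("depth", "latency", "awakenings", "time_awake", "quality")


def rcsq_score(responses):
    """Return the mean of the five 0-100 mm visual analog scale items."""
    missing = [item for item in RCSQ_ITEMS if item not in responses]
    if missing:
        raise ValueError(f"missing RCSQ items: {missing}")
    values = [responses[item] for item in RCSQ_ITEMS]
    if any(not 0 <= value <= 100 for value in values):
        raise ValueError("each item must lie between 0 and 100 mm")
    # Sum the five items and divide by 5, as described in the text.
    return sum(values) / len(RCSQ_ITEMS)


# Made-up responses for one night (not study data).
example = {"depth": 40, "latency": 55, "awakenings": 30, "time_awake": 60, "quality": 50}
score = rcsq_score(example)  # (40 + 55 + 30 + 60 + 50) / 5 = 47.0
```

Rejecting incomplete questionnaires rather than averaging the available items is a design choice of this sketch; the text does not say how partially completed surveys were handled.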
Participants were also asked to report the total hours of sleep they experienced.
As part of routine practice, nurses twice daily recorded their patients' activity levels on the following scale:
1. Bedfast (confined to bed)
2. Chairfast (ability to walk severely limited or non-existent)
3. Walks occasionally (walks occasionally during day, but for very short distances, and spends majority of each shift in bed or chair)
4. Walks frequently (walks outside the room at least twice a day and inside room at least once every two hours during waking hours)
Fitbit data were transmitted from the wearables via Bluetooth to a mobile device using the Fitbit app and then uploaded to Fitbit servers. Fitbit heart rate, sleep, and activity data were recorded minute-by-minute and downloaded from Fitbit servers using a custom, in-house Python program using the Fitbit application programming interface. 19
Data Processing: Wear time for Fitbit data can be determined through rules based on activity or heart rate data. 20 Since hospitalized patients are known to have low mobility, we used a heart rate-based algorithm instead of a steps-based algorithm. 20,16 With heart rate-based algorithms, a device is assumed not to be worn if significantly fewer heart rate recordings than usual are recorded. 20 Typically, when the device was worn consistently, there would be 50-60 heart rate recordings per hour. We assumed the device was not worn when there were fewer than 30 heart rate recordings per hour. This threshold was chosen to avoid incorrectly treating the device as worn because of occasional spurious heart rate measurements, a known problem that can occur even when devices are not worn. 21,22 To calculate sleep episodes, the three Fitbit sleep stages ('deep', 'light', and 'REM') were summed to provide the total sleep time per episode.
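A minimal sketch of the heart rate-based wear-time rule and the sleep-stage sum described above is given here. It is not the study's in-house program: the function names, the choice to count recordings per clock hour (rather than per rolling window), and the dictionary keys for the sleep stages are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

MIN_RECORDINGS_PER_HOUR = 30  # threshold stated in the text


def worn_hours(hr_timestamps, threshold=MIN_RECORDINGS_PER_HOUR):
    """Return the clock hours in which the device is treated as worn.

    Assumption: recordings are counted per clock hour, not per rolling window.
    """
    per_hour = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in hr_timestamps
    )
    # Fewer than `threshold` recordings in an hour means "not worn".
    return sorted(hour for hour, count in per_hour.items() if count >= threshold)


def total_sleep_minutes(stage_minutes):
    """Sum deep, light and REM minutes for one sleep episode (keys are hypothetical)."""
    return sum(stage_minutes.get(stage, 0) for stage in ("deep", "light", "rem"))


# Made-up timestamps: 55 recordings at 08:00-08:54, 5 spurious ones after 09:00,
# and 30 recordings spread across the 10:00 hour.
start = datetime(2019, 6, 1, 8, 0)
timestamps = (
    [start + timedelta(minutes=m) for m in range(55)]
    + [start + timedelta(hours=1, minutes=m) for m in range(5)]
    + [start + timedelta(hours=2, minutes=2 * m) for m in range(30)]
)
worn = worn_hours(timestamps)  # the 08:00 and 10:00 hours; 09:00 is excluded
sleep = total_sleep_minutes({"deep": 60, "light": 180, "rem": 45, "wake": 30})  # 285
```

In this example the hour with only five recordings is discarded, which mirrors the stated aim of ignoring occasional spurious readings from an unworn device.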
Analysis: Continuous demographic variables were analyzed using descriptive methods; categorical variables using contingency tables and histogram plots. Primary endpoints were the correlation between the following measures: 1) Nurse-reported vital signs matched using the timestamp documented in the electronic health record to the closest Fitbit heart rate data point within a one-minute tolerance, 2) Fitbit sleep and the RCSQ completed by patients for the same night, and 3) Fitbit activity and nurse-recorded patient activity over the same day. Because atrial fibrillation reduced the accuracy of Fitbit heart rates in another study, 12 we conducted a predefined subgroup analysis for heart rate in patients who did or did not have atrial fibrillation. Atrial fibrillation was determined based on patient history and ECGs performed during admission. Statistical analyses were performed using Python SciPy libraries. 23
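The first endpoint depends on pairing each nurse-recorded heart rate with the closest Fitbit reading within one minute. The sketch below shows one way such matching and a Pearson correlation could be done. It uses only the Python standard library rather than SciPy, and the function names, the inclusive treatment of the one-minute boundary, and the sample readings are assumptions, not the study's code or data.

```python
from bisect import bisect_left
from datetime import datetime, timedelta
from math import sqrt


def match_pairs(nurse, fitbit, tolerance=timedelta(minutes=1)):
    """Pair each nurse reading with the closest Fitbit reading within tolerance.

    Both inputs are lists of (timestamp, bpm); fitbit must be sorted by time.
    Nurse readings with no Fitbit reading inside the tolerance are dropped.
    """
    times = [t for t, _ in fitbit]
    pairs = []
    for t, nurse_bpm in nurse:
        i = bisect_left(times, t)
        # The closest reading is either just before or just after position i.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(times[k] - t))
        if abs(times[j] - t) <= tolerance:
            pairs.append((nurse_bpm, fitbit[j][1]))
    return pairs


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up readings (not study data).
t0 = datetime(2019, 6, 1, 9, 0)
fitbit = [(t0 + timedelta(minutes=m), bpm) for m, bpm in [(0, 80), (1, 82), (2, 85), (30, 90)]]
nurse = [
    (t0 + timedelta(seconds=50), 84),             # closest Fitbit reading is 10 s away
    (t0 + timedelta(minutes=10), 88),             # nothing within one minute: dropped
    (t0 + timedelta(minutes=30, seconds=20), 92), # closest Fitbit reading is 20 s away
]
pairs = match_pairs(nurse, fitbit)
r = pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
```

Dropping unmatched nurse readings, rather than widening the window, keeps the comparison restricted to near-simultaneous measurements, at the cost of fewer pairs.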

Results
From June 2019 to Oct 2019, a total of 50 patients were enrolled. Patient demographics are shown in Table 1. Of the 50 patients enrolled, 47 patients had Fitbit heart rate data. The other three had no heart rate data and were assumed to not have worn the device. The main reason patients gave for not wearing the wrist-worn Fitbit device was that it was uncomfortable. Of the 47 patients who wore the band, 40 appeared to wear it continuously. Six patients took off the band for one extended period of time (median 8 hours). One patient took off the band for two extended periods, each night for two nights (4 hours and 14 hours). Median time worn was 50 hours (IQR 29-69 hours).

Heart Rate
For the 47 patients, there were on average 3,028 heart rate recordings per patient (SD 1,785) from the wrist-worn device. The average heart rate was 83 bpm (SD 14).
Comparing pairs of Fitbit and nurse recorded heart rates, there were 261 pairs available for comparison.
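As a check on the agreement statistics, the 95% limits of agreement can be recomputed from the summary values reported in the abstract (mean difference 0.45 bpm, SD 13.0), assuming the conventional Bland-Altman form of mean difference plus or minus 1.96 standard deviations. The function names below are illustrative, and the direction of subtraction (Fitbit minus nurse or the reverse) is not stated in the text.

```python
from statistics import mean, stdev


def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """Return (lower, upper) limits of agreement: mean_diff -/+ z * sd_diff."""
    return mean_diff - z * sd_diff, mean_diff + z * sd_diff


def bland_altman(differences, z=1.96):
    """Compute limits of agreement from raw paired differences (sample SD)."""
    return limits_of_agreement(mean(differences), stdev(differences), z)


# Summary values from the abstract: mean difference 0.45 bpm, SD 13.0.
lower, upper = limits_of_agreement(0.45, 13.0)
# lower is about -25.03 and upper about 25.93, which round to the
# reported limits of -25 and 26 bpm.
```

Limits this wide mean that an individual Fitbit reading could differ from the nurse-recorded value by roughly 25 bpm in either direction, even though the average difference is close to zero.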

Activity
From wearable data, patients took a median of 1,150 steps per day (IQR 366-2,179). For 45% of the days, patients took fewer than 1,000 steps (Appendix Figure 1). There were three occurrences of very high daily steps (over 9,000 steps) in 3 different patients. Figure 2 shows when steps were taken through the day. A peak of steps occurred in the morning with activity mainly occurring between 8 am and the early evening. Table 2 shows the distribution of daily steps as recorded by the wrist-worn device for each activity level. The correlation between nurse recorded activity and daily steps was 0.06 (P=0.52) and did not improve with removal of the three outliers mentioned above.

Sleep
A summary of the sleep data is shown in Table 3. Median sleep duration as recorded by the wearable was 5.0 hours (IQR 2.9-7.1). There were 4 users who did not wear the Fitbit overnight. There were 3 users who wore the Fitbit overnight but did not have any recorded sleep periods. Of the 47 participants who completed the RCSQ, the average score was 52 (IQR 32-64). Patients reported median sleep duration of 5.3 hours. The association between the RCSQ score and total sleep duration by Fitbit was 0.19 (P=0.24) and between the self-reported hours of sleep and wearable total sleep duration was 0.21 (P=0.21). Aggregating Fitbit sleep data from all users, there was a peak of sleeping activity between 11 pm and 6 am and minimal sleeping between 11 am and 2 pm (Figure 3).

Discussion
This is one of the first and largest studies to examine the correlation between heart rate, sleep and activity recordings from the Fitbit wearable device with concomitant clinician and patient observations in hospitalized general medicine patients. Overall, wearable-derived heart rate measurements correlated reasonably well with nurse-recorded values but did not correlate well with nurse assessments of activity nor with patients' self-assessment of their sleep.
Fitbit-derived heart rate measurements correlated modestly with nurse-recorded vital signs (R = 0.69). This finding is similar to that of a previous Fitbit study, which found that heart rate correlated well with ECG measurement (R = 0.74) in ICU patients. 12 As continuous ECG measurement was not widely available in the inpatient ward setting, our study compared Fitbit data to nurse-recorded vital signs. In our hospitals, nurses must write down the time that vital signs were taken and manually enter this information into the electronic health record; this may have led to greater discrepancy between the recorded and actual timing of vital sign measurement, which may partially explain the lower correlation we observed. General medicine patients are also more mobile than ICU patients, which may also have increased the variability of heart rate measurements and resulted in lower correlation between nurse-recorded values and Fitbit data.
A small number of patients in our study had step counts that showed nearly 10,000 steps per day, which could equate to eight kilometers depending on stride length. This is implausible in our inpatient population. Potential errors include Fitbit devices misinterpreting wheelchair, stretcher, or arm movements (such as tremors) as steps. Our results showed no correlation (R = 0.06, P = 0.52) between Fitbit-measured activity and nurses' assessments. While Fitbit step counts have been found to be accurate in the 'free-living' and laboratory settings in a systematic review of 67 studies, 15 wrist-worn activity trackers such as Fitbits have been found to have only moderate agreement with activity measured by ankle-worn devices in the inpatient rehabilitation setting and post-cardiac surgery. 24,25 This appears to be due to decreased ability to identify steps at lower gait velocity because of dependence on data recorded at the wrist, to shorter stride lengths and to the use of walkers. 26 However, an alternative explanation for the lack of correlation between Fitbit activity data and nurses' activity assessments in our study is an inability of the nurses' "subjective overall impression" method to capture accurate information about patient activity. While routine nursing assessments have been found to correlate with clinical outcomes, these assessments have not been validated in rigorous prospective studies. 27 Indeed, there is more evidence on the accuracy of Fitbits than on the accuracy of nurse activity assessments. Further study, using objective tools such as ankle-worn devices, is needed to validate wrist-worn devices for activity monitoring in the hospitalized inpatient setting.
However, all studies were either in the participant's home or a sleep lab, and none were in hospitalized patients. Indeed, our results were similar to the correlation with self-reported sleep found in ICU patients (R = 0.33, P = 0.03). 13 Since the RCSQ has been found to correlate moderately well with polysomnography in ICU patients, 18,29 the discrepancy between our findings and the previous results with Fitbits may suggest that sleep detection algorithms of Fitbits may be less accurate for hospitalized patients. Patients' conditions and the hospital environment may reduce the accuracy of Fitbit sleep algorithms. Fitbit algorithms use heart rate and activity to determine sleep episodes, and acutely ill patients in hospital may have different heart rate variability and activity than healthy people at home. Similarly, the noisy hospital environment is quite different from the home or lab setting, and the algorithms were found to be inaccurate with disrupted or abnormal sleep, which is highly prevalent in hospital. 9,30
As this was a pragmatic pilot study evaluating the use of Fitbit devices on the General Internal Medicine ward, the major limitation of this study is the lack of robust gold standard methods in this setting. Ideally, heart rate would also be measured continuously by two other methods, including ECG. 31 The time entered for heart rate by nurses may not reflect the true measurement time and would miss the minute-to-minute variation in heart rate. Ideally, sleep would be measured by polysomnography, and activity would be measured by ankle-worn accelerometers. 24 Future studies should incorporate robust gold standard measures. Our algorithm to determine if a Fitbit was worn may have been too conservative and may have missed episodes where the Fitbit was actually worn. However, this was done to increase the specificity of our measurements. Finally, very high numbers of steps were unexpectedly found and the potential causes were not determined in this study. Future studies could look at what contributes to unexpectedly high step counts.
The ideal device to help address post-hospital syndrome would be inexpensive, easy to use and accurate. While the Fitbit devices that we used were relatively inexpensive and most patients wore them, there were issues in sleep detection and activity recognition for hospitalized patients. Further research is needed to know when wearable measures are accurate and when they are not in hospitalized patients. Ultimately, device algorithms will likely improve further; improvements in sleep stage detection have already been demonstrated in newer devices compared to the previous generation. 28 As existing algorithms for determining sleep and steps were not developed using data from hospitalized patients, it is not surprising that these devices are less accurate in the hospital setting. Further refinement of these devices in collaboration with device manufacturers is necessary. Current devices may still be useful to help measure changes due to sleep or mobility interventions, but proper validation is still necessary to determine whether these changes from baseline are accurate and meaningful. 32 Finally, there is strong potential for affordable, accurate devices to help manage patients both in the hospital and in the transition to home, to help reduce readmissions.

Conclusions
Overall, wearable-derived measurements correlated reasonably with nurse-recorded heart rate but did not correlate with nurse assessments of activity nor with patient self-assessment of sleep. This study highlights limitations of the accuracy of current wearable wrist-worn device algorithms in activity and sleep detection in hospitalized patients. This study calls into question the validity of Fitbits for assessment of patient activity and sleep in the hospital setting and suggests that they should not be routinely used without further research and validation.

Figure: Bland-Altman plot of nurse-recorded heart rate and Fitbit heart rate
Figure: Steps taken throughout the day by all patients