Recent growth in remote studies has shown the effectiveness of digital health technologies in recruiting large and diverse populations of interest and monitoring their health and behavior in real-world settings. However, retaining and engaging participants to monitor their long-term health trajectories has remained a significant challenge. Uneven participant engagement combined with attrition over the course of a study could lead to an imbalanced study cohort and imbalanced data collection, which may severely impact the generalizability of real-world evidence.
We report findings on long-term participant retention and engagement patterns in a multinational remote digital depression study with up to two years of real-world behavior monitoring. In total, real-world engagement data from 614 participants, comprising 14,964 surveys and 135,014 days of passive phone and wearable (Fitbit) data, were analyzed using survival analysis and unsupervised clustering methods. A considerable proportion of participants (N = 415; 67.6%) were retained during the first 43 weeks of the study. Clustering participants’ long-term usage data from the study apps and wearables revealed three distinct subgroups with different engagement levels (most, middle, and least engaged). Notable findings from comparing participants' characteristics across these subgroups were: (1) Participants with the highest baseline depression severity (PHQ-8 score 4 points higher, p < .01) were in the least engaged group (33.2%; N = 204; median bi-weekly surveys completed = 4), compared with the most engaged group (37.6%; N = 231), which on average completed 20 bi-weekly surveys. (2) A considerable proportion (44.6%; N = 91) of participants in the least engaged group still contributed wearable data for up to 10 months. (3) Participants in the least engaged group also took significantly longer to respond to surveys in naturalistic settings (3.8 hours more, p < .001) and were younger (age difference = 5 years, p < .01) than participants in the most engaged group.
Our findings show that various factors, such as socio-demographics, app usage behavior, and depression severity, can be linked to long-term retention and to the density of real-world data collected in remote digital research studies. Moreover, passive data gathered from wearables without additional participant burden showed advantages over active survey data for long-term monitoring, providing greater contiguity and duration of collected data. Together, these findings could inform the design of future remote digital health research studies to enable equitable and balanced health data collection from diverse target populations.