The overall results indicate that the intervention and study design are feasible for a full-scale randomized controlled trial. Recruitment was promising, though retention was somewhat low; together, these rates provide an important guideline for how many participants must be invited to reach a target sample size. Engagement was satisfactory, with decent adherence and high app engagement ratings. Acceptability metrics were overall very promising, though the quality of the prompts needs improvement. Measurement quality was good overall, with a high completion rate and substantial within-person variability. A few adjustments are recommended to refine the intervention and study protocol before conducting an RCT.
Recruitment and retention: 17.8% of invited participants chose to take part in the study, meaning that just under 1 in 5 invited people can be expected to join. This rate is consistent with a similar study population: Kowalski et al. [26] recruited 24% of invited participants, suggesting that recruitment rates in this range are typical for studies of this kind.
Importantly, however, retention rates dropped off markedly, especially for the follow-up measure, which was completed by only 44% of participants. Given this relatively low retention rate, an improved study protocol should mitigate drop-out, for instance through reminder e-mails and general encouragement to keep participating. Notably, the follow-up retention rate observed here may be especially low because the follow-up measure coincided with a vacation period, when participants may have been less inclined to respond.
Finally, rather than viewing recruitment and retention rates only as problems to be solved (though efforts should certainly be made to maximize both), these numbers provide an important benchmark for how many participants must be invited to reach a target sample size. Based on the results of the current study, we need to invite 10–12 times as many people as are needed in the final statistical analysis. Given that the planned RCT has wide inclusion criteria as well as a flexible, continuous, and automated recruitment process, inviting sufficiently large numbers of participants is feasible.
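As a sanity check, this multiplier follows directly from the rates reported above. The sketch below (function name and target sample size are illustrative, not part of the study protocol) applies the 17.8% recruitment rate and 44% follow-up retention rate:

```python
import math

# Planning arithmetic based on the rates reported in this study:
# 17.8% of invitees enrol, and 44% of enrollees complete the follow-up.
RECRUITMENT_RATE = 0.178
RETENTION_RATE = 0.44

def invitations_needed(target_n,
                       recruitment_rate=RECRUITMENT_RATE,
                       retention_rate=RETENTION_RATE):
    """Invitations required so that roughly target_n participants
    remain in the final (follow-up) analysis."""
    completers_per_invite = recruitment_rate * retention_rate
    return math.ceil(target_n / completers_per_invite)

# At these rates, one follow-up completer per ~12.8 invitations.
multiplier = 1 / (RECRUITMENT_RATE * RETENTION_RATE)
```

At these rates the multiplier lands slightly above 12 when follow-up completers are the target; targeting the post-intervention sample instead would lower it toward the bottom of the 10–12 range.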
Randomization algorithm: One problem with the data collection procedure was an uneven distribution of participants across the intervention groups (N = 7, N = 5, N = 4). The randomized block sizes (3, 6, 9) were too large for the number of participants, resulting in an imperfect distribution. Since evenly distributed groups are important for future data analyses, we will adjust the block sizes accordingly.
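A minimal sketch of permuted-block randomization with the smallest workable block size (equal to the number of arms), which bounds the group-size difference at one even if recruitment stops mid-block; function and argument names are illustrative:

```python
import random

def permuted_block_randomization(n_participants, arms=("A", "B", "C"),
                                 block_size=3, seed=None):
    """Assign participants to arms in independently shuffled blocks.

    With block_size equal to the number of arms, every block contains each
    arm exactly once, so group sizes can never differ by more than one.
    """
    assert block_size % len(arms) == 0, "block size must be a multiple of the number of arms"
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_participants:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)  # shuffle within the block to keep allocation unpredictable
        assignments.extend(block)
    return assignments[:n_participants]
```

With 16 participants and three arms, for example, this yields group sizes of 6, 5, and 5 regardless of where recruitment stops.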
4.1.2 Engagement – The intervention had satisfactory adherence and received very high app engagement ratings.
Adherence varied greatly among participants, with some using the intervention daily and others only 4–5 times over the 28-day period, averaging a 51% adherence rate (14.3 out of 28 days). Importantly, engagement with digital interventions is a widespread challenge, with many studies reporting low adherence rates [33, 41]. Adherence also varies widely depending on the type of intervention, making a given rate difficult to interpret without reference to the specific context.
Given the context of this study and project, a 51% adherence rate is deemed satisfactory. Firstly, it is an improvement over a prior version of the intervention, which had an average adherence of 39% [26]. In addition, DIARY is an unguided intervention that users engage with wholly on their own terms; the instructions even state that one may use the intervention exactly as often as one likes. Compared with guided interventions – which, for example, include therapist support and an explicit treatment plan – unguided interventions typically show lower adherence [36, 37].
Lastly, adherence rates could likely be further improved based on findings from a previous study investigating user engagement with DIARY [26]. For instance, involving employers to encourage intervention use and increasing use intention among participants are additional measures that could be included in the study protocol to increase engagement. Other studies suggest that tailoring and social influence are key factors for promoting engagement, something that may be included in future versions of DIARY [5, 65].
Results from the App Engagement Scale indicate that the intervention and app design are sufficiently user-friendly and engaging. The scale had a mean score of 4.36 out of 5, a very positive rating [3]. The questionnaire concerns the user experience of the mobile application, asking whether users find it easy, enjoyable, and motivating to use. Notably, this score is a substantial improvement over the 3.44 rating observed for a prior version of the intervention that used a different digital tool [26].
4.1.3 Acceptability – Participants found the intervention overall acceptable and technically stable, though the prompts need improvement.
Perceived effectiveness: Single-item acceptability metrics indicate that participants were overall satisfied with the intervention and found it suitable. On average, participants found the intervention content “mostly relevant” and would “most likely” want to use such an app again in the future (see Table 2). These are both promising metrics, indicating that the content of the intervention is relevant to this population and that it was sufficiently well-designed and helpful that they would want to access it again.
Some ratings of the intervention's perceived effectiveness were slightly lower: participants did not find the prompts very useful (2.82 out of 4) and did not fully experience that the intervention helped them deal with challenges in life more effectively (4.55 out of 6). These results indicate that the prompts may need to be refined to become more helpful to participants.
One way of improving the prompts would be to base them on a well-established framework outlining a variety of effective strategies for optimizing recovery processes. A problem with the current prompts, which became evident during development, was that the underlying recovery “type” for each intervention version (social support, psychological strategies, physical activity) was too narrowly defined, resulting in prompts that were repetitive and one-dimensional. In effect, the same recovery strategy was suggested repeatedly with minor modifications.
Rather than trying to isolate the “best” type of recovery strategy and center a whole intervention around it, it may be more fruitful to recommend a wide range of different recovery strategies to users. Most models of well-being include multiple components and needs, suggesting that multiple types of strategies may contribute to improving mental health [11, 46]. A wide repertoire of recovery strategies may thus be conducive to optimal recovery.
Providing a variety of different types of recovery strategies may be beneficial for other reasons as well. Firstly, it increases the likelihood of users finding a strategy that is possible to implement on a given day and that matches diverse lifestyles. Secondly, recovery may be most effective when it corresponds to current needs because different stressors require different types of recovery to optimally mitigate their negative effects [10]. Relaxation exercises might be helpful to unwind from a cognitively demanding day, while talking to a close friend is more appropriate if one experiences high emotional demands at work. A larger toolbox of recovery strategies makes it more likely that users will find strategies most beneficial to them at any given moment.
One way to include a varied and well-balanced set of recovery strategies grounded on a theoretical foundation would be to craft prompts based on the DRAMMA framework [39]. This framework integrates various models of recovery and well-being, outlining six different types of experiences during leisure time that support mental health: detachment, relaxation, autonomy, mastery, meaning, and affiliation. Interventions using this model have been found to be effective in improving relevant outcomes in a working population [60]. By developing prompts according to a well-rounded framework which includes a large variety of recovery strategies, it is more likely that prompts will be helpful to users and address a wider range of recovery needs.
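As an illustration of how such prompts could be organized (the prompt texts below are invented placeholders, not the intervention's actual content), a simple scheme cycles through all six DRAMMA dimensions before repeating any of them, avoiding the one-dimensionality of the current version:

```python
import random

# Illustrative only: the six DRAMMA dimensions, each mapped to placeholder
# prompts standing in for properly crafted intervention content.
DRAMMA_PROMPTS = {
    "detachment": ["Put work communication aside for the evening."],
    "relaxation": ["Try a ten-minute breathing exercise."],
    "autonomy":   ["Spend time on an activity you freely chose."],
    "mastery":    ["Practice a skill or hobby you want to improve."],
    "meaning":    ["Do something that feels personally meaningful to you."],
    "affiliation": ["Reach out to a friend or family member."],
}

def daily_prompt(day_index, rng=None):
    """Rotate through all six dimensions before repeating any of them."""
    rng = rng or random.Random()
    dimension = list(DRAMMA_PROMPTS)[day_index % len(DRAMMA_PROMPTS)]
    return dimension, rng.choice(DRAMMA_PROMPTS[dimension])
```

A production version would of course draw on a larger, professionally developed prompt pool per dimension, but the rotation principle is the point: every week of use touches every recovery need.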
Technical stability: Results also indicate that the intervention is overall technically stable with several participants not having any technical issues whatsoever. The few reported technical difficulties were very minor and did not cause users substantial issues. This is a clear improvement over a previous iteration of DIARY which had considerable technical problems [26]. These results are very promising, but even so, efforts will be made to mitigate any technical issues before future studies.
4.1.4 Measurement quality – Very high completion rates, though with a potential caveat; daily measures show substantial within-person variability; some outcome measures may need to be changed to better answer the research questions.
Participants who took part in a measure provided complete data, answering every item of every questionnaire. Although a 100% completion rate is considered excellent, it may also point to a problem with the data collection procedure. Participants could not skip questions, and responses were not saved on the server until the entire questionnaire was completed. Consequently, only fully completed measures were registered, producing the 100% completion rate. Some participants may have stopped midway through a measure, in which case their partial responses were never registered. The low retention rate may thus partly reflect participants who answered a measure only in part and were therefore not registered as completers.
Because of the strict criteria for registering data – inability to skip questions and only registering fully completed measures – we may miss out on valuable data. One way to mitigate this issue is by loosening the criteria for collecting data, for instance by giving participants the option to skip questions. Additionally, one can adapt the data collection system so that partial data is registered in the database. This will likely lead to collecting more data, even if it is sometimes incomplete, and may have the added benefit of improving retention rates.
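A minimal sketch of what such item-level saving could look like, with an in-memory dictionary standing in for the study database (all names are illustrative, not the actual data collection system):

```python
# Each answer is written as soon as it is given, so a participant who stops
# midway still contributes partial data instead of being lost entirely.
responses = {}  # (participant_id, questionnaire, item) -> answer

def save_item(participant_id, questionnaire, item, answer):
    """Persist a single answer immediately; None marks a skipped item."""
    responses[(participant_id, questionnaire, item)] = answer

def completion_rate(participant_id, questionnaire, items):
    """Fraction of the questionnaire's items this participant answered."""
    answered = sum(
        responses.get((participant_id, questionnaire, item)) is not None
        for item in items
    )
    return answered / len(items)
```

Under this design, completion becomes a continuous per-participant quantity rather than the all-or-nothing outcome produced by the current submit-at-the-end procedure.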
Another important question is whether the outcome measures are appropriate for fully understanding the intervention effects. Most outcome measures proved relevant; however, given the suggested changes to the intervention, new outcome measures may be more appropriate. The Recovery Experience Questionnaire (REQ) does not capture all dimensions of the DRAMMA framework – the dimensions Affiliation and Meaning, for instance, are missing – and may thus not provide information about all types of recovery strategies. Instead, the DRAMMA-Q may be better suited to ensure a comprehensive picture of the various recovery strategies [27].
Within-person variability: The daily stress measure proved useful for measuring individual change over time, with within-person variability accounting for 58% of the observed variance. This indicates that daily measurement is important for capturing participants' experience and may yield important insights into how stress fluctuates from day to day. These insights can in turn be used to further improve interventions and other efforts to mitigate the negative consequences of stress.
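A figure such as the 58% share would typically come from a multilevel model; as a crude illustration of what it quantifies, the within-person share of variance can be approximated by an ANOVA-style sum-of-squares decomposition of the daily ratings (function name and example data are illustrative):

```python
import statistics

def within_person_share(daily_scores):
    """Share of total variance due to day-to-day fluctuation within persons.

    daily_scores: dict mapping each person to their list of daily ratings.
    Sums of squares around person means (within) vs. the grand mean (total).
    """
    all_scores = [x for xs in daily_scores.values() for x in xs]
    grand_mean = statistics.fmean(all_scores)
    total_ss = sum((x - grand_mean) ** 2 for x in all_scores)
    within_ss = sum(
        (x - statistics.fmean(xs)) ** 2
        for xs in daily_scores.values() for x in xs
    )
    return within_ss / total_ss
```

A share near 1 means almost all variation happens within individuals over time (a single baseline measurement would miss it), while a share near 0 means stress mainly differs between individuals.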