Participants and procedure
The study used data from the MyMovez project [32], which are stored at Data Archiving and Networked Services in the Netherlands (https://doi.org/10.17026/dans-zz9-gn44). The project used a longitudinal design and investigated healthy lifestyles of adolescents in the Netherlands. In the project, 17 primary school and 11 secondary school classes participated in one or more of the seven data collection waves spread over 3 years (for an overview of the measurement periods, see Table 1). In each wave, participants received the MyMovez Wearable Lab for seven consecutive days, which consisted of a research smartphone with the corresponding MyMovez app and a wrist-worn accelerometer. On the smartphone, participants received daily questionnaires at random moments between 7:00h and 19:30h. To make participation more enjoyable, participants were allowed to play a game for five minutes per hour and, during the last three waves of the project, could chat with each other on the social media platform ‘Social Buzz’. The accelerometer was used to measure the amount and intensity of physical activity per minute.
In these three years, a total of 1484 individual adolescents participated in 130 classrooms. The participants were between 8 and 17 years old (MAge = 11.23, SDAge = 1.74, 46.11% male). For an overview of the number of participants and dates per wave see Table 1. The hypotheses, study design, sample, data collection procedure, measured variables, and plan of analysis were preregistered on the Open Science Framework (OSF, https://osf.io/64gw5/?view_only=0c6894d63eeb4cb5aa01d91d87085c56 [masked for blinded peer review]). A subset of the data (wave 4) was used to determine the syntax for the analyses and was excluded from the analysis presented in this paper.
Table 1
Overview of the measurement periods
| Wave | Number of Participants | Start (DD-MM-YYYY) | Finish (DD-MM-YYYY) |
|------|------------------------|--------------------|---------------------|
| 1 | 843 | 27-01-2016 | 09-03-2016 |
| 2 | 901 | 31-03-2016 | 17-05-2016 |
| 3 | 868 | 01-06-2016 | 22-06-2016 |
| 4 | 744 | 15-02-2017 | 28-03-2017 |
| 5 | 1017 | 02-02-2018 | 17-04-2018 |
| 6 | 755 | 13-04-2018 | 06-06-2018 |
| 7 | 745 | 16-05-2018 | 04-07-2018 |
Note. Start is the first day of the first participants and Finish is the last day of the last participants.
Measures
Physical activity. Participants wore the accelerometer (Fitbit Flex™) on their non-dominant wrist for the length of the measurement period. Per day, data were gathered between 7:00h and 19:30h. Incomplete days of data (< 750 minutes of accelerometer data) were excluded from the data. Potential reasons for partial data were that the materials were distributed or collected at the schools, empty batteries, or non-wear time of the accelerometer [33][Masked for review]. Therefore, a maximum of five days of physical activity data was available per participant per wave. The amount of physical activity was expressed as the number of steps per day (or per minute). Participants accumulated on average between 1,163.00 and 21,032.50 steps per day (MSteps = 8,931.15, SDSteps = 2,941.81, MdnSteps = 8,570.18).
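The completeness rule can be illustrated with a minimal sketch. It is written in Python for illustration only (the study's analyses were run in R) and is not the authors' code. It assumes per-minute records of the form (participant, day, minute of day, steps) and treats a day as complete when all 750 minutes between 7:00h and 19:30h are present. The function name `complete_days` and the record layout are hypothetical.

```python
from collections import defaultdict

WINDOW_START = 7 * 60        # 07:00 expressed in minutes since midnight
WINDOW_END = 19 * 60 + 30    # 19:30 expressed in minutes since midnight
REQUIRED_MINUTES = 750       # completeness threshold stated in the text


def complete_days(records):
    """Return {(participant, day): total steps} for complete days only.

    records: iterable of (participant, day, minute_of_day, steps) tuples
    (hypothetical layout, one tuple per recorded minute).
    """
    minutes_seen = defaultdict(set)
    total_steps = defaultdict(int)
    for participant, day, minute, steps in records:
        # Only minutes inside the 07:00-19:30 window count toward the day.
        if WINDOW_START <= minute < WINDOW_END:
            minutes_seen[(participant, day)].add(minute)
            total_steps[(participant, day)] += steps
    # Keep a day only if it reaches the required number of recorded minutes.
    return {
        key: total_steps[key]
        for key, minutes in minutes_seen.items()
        if len(minutes) >= REQUIRED_MINUTES
    }
```

Under these assumptions, a day on which the device was handed out at school halfway through the morning would fall below 750 recorded minutes and be dropped.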
Subsequently, the number of steps per day was averaged per participant to create a variable for the mean score of physical activity per participant. This mean score was used to represent the physically active lifestyle of the participant and to test the between-subjects effect of physical activity on happiness. In addition, for each daily number of steps, the deviation from the participant's average number of steps was calculated to represent relatively active or inactive days compared to the average behavior of the participant. This relative number of steps was used to test the within-subjects effect of physical activity on happiness.
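The split into a between-subjects and a within-subjects predictor can be sketched as follows. This is an illustrative Python sketch, not the authors' R syntax. The function name `decompose_steps` and the input layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def decompose_steps(daily_steps):
    """Split daily steps into a person mean and a daily deviation.

    daily_steps: list of (participant, steps) tuples, one per valid day.
    Returns a list of (participant, steps, person_mean, deviation) tuples,
    where deviation = steps - person_mean.
    """
    steps_by_person = defaultdict(list)
    for participant, steps in daily_steps:
        steps_by_person[participant].append(steps)
    # Between-subjects predictor: average daily steps per participant.
    person_mean = {p: mean(values) for p, values in steps_by_person.items()}
    # Within-subjects predictor: how far each day is from that average.
    return [
        (p, steps, person_mean[p], steps - person_mean[p])
        for p, steps in daily_steps
    ]
```

For a participant with 8,000 and 10,000 steps on two days, the person mean is 9,000 and the daily deviations are -1,000 and +1,000.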
Happiness. At random moments during the day, participants received an experience sampling question that measured happiness at that specific moment in time: “Indicate on the line below how happy you are at this moment”. Participants responded to the question on a visual analogue scale by placing their finger on a slider ranging from 0 (“very unhappy”) to 100 (“very happy”), see Figure 2. On average, participants scored rather high on this question (M = 75.82, SD = 16.52, Mdn = 76.24).
In addition, eudaemonic well-being was measured once per wave to assess the criterion validity (concurrent validity) of the experience sampling question for happiness. Eudaemonic well-being was measured with the Faces Scale [34], a single-item measure that asks participants “Overall, how do you usually feel?” Participants responded by selecting one of seven drawings of faces, arranged in a horizontal line, ranging from 1 (“very happy”) to 7 (“very unhappy”; see Figure 2). The values were recoded so that an increase on the scale represents an increase in subjective well-being. Again, participants scored rather high on average on the well-being scale (M = 5.58, SD = 1.09). A Bayesian correlation test (r = 0.30, BF01 = 57.44 ± 0) indicated very strong evidence for a moderate correlation between the two measures. Therefore, we concluded that eudaemonic well-being and the happiness measure were related and deemed the ESM happiness question a valid measure of happiness.
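The recoding step can be illustrated with a short Python sketch. The text states only that the values were recoded; the formula below (8 minus the raw score) is the conventional reversal of a 1 to 7 scale and is an assumption. The function name `recode_faces_scale` is hypothetical.

```python
def recode_faces_scale(raw_score):
    """Reverse a 1-7 Faces Scale response so that higher means happier.

    Assumes the conventional reversal (8 - raw); the exact recoding used
    in the study is not stated in the text.
    """
    if raw_score not in range(1, 8):
        raise ValueError("Faces Scale responses must be integers from 1 to 7")
    return 8 - raw_score
```

With this reversal, the original 1 (“very happy”) becomes 7 and the original 7 (“very unhappy”) becomes 1.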
Covariates. Participants indicated their sex and age at the start of the project. Both were included as covariates because males tend to be more physically active than females, and younger adolescents tend to be more physically active than older adolescents [35]. In addition, each day was coded as a weekday or a weekend day to control for differences in physical activity [36].
Strategy of analysis. All data were handled and analyzed in R [37]. To test the hypotheses, Bayesian mixed-effects models were fitted with the BayesFactor package [38], using Jeffreys-Zellner-Siow priors. In each analysis, two models were tested against each other. The null model included the dependent variable, the covariates (i.e., sex, age, and weekend), and random intercepts per participant and wave. This model was compared to a second model that additionally included the predictor of interest. The outcome of this comparison was the amount of evidence in support of the second model (with the predictor of interest) over the null model (without the predictor of interest), expressed as the Bayes factor. The Bayes factor is the relative strength of evidence for the alternative hypothesis over the null hypothesis. Values greater than 1 indicate evidence for the alternative hypothesis, and the higher the value of the Bayes factor, the stronger this evidence. In contrast, values lower than 1 indicate evidence for the null hypothesis over the alternative hypothesis, which becomes stronger as the value approaches 0. When there is no support for either of the two hypotheses, the Bayes factor is close to 1 [39]. To interpret the strength of the support for the hypotheses, the classification of Jeffreys [40] was used.
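How a Bayes factor maps onto evidence categories can be sketched in Python. The cut-offs below are the commonly cited half-decade boundaries of Jeffreys' scheme (10^0.5, 10, 10^1.5, 100, often rounded to 3, 10, 30, 100). Treating them as the exact boundaries applied in this study is an assumption. The function name `classify_bayes_factor` is hypothetical.

```python
import math

# Upper bounds and labels of the evidence categories (assumed cut-offs).
JEFFREYS_BOUNDS = [
    (10 ** 0.5, "barely worth mentioning"),
    (10.0, "substantial"),
    (10 ** 1.5, "strong"),
    (100.0, "very strong"),
]


def classify_bayes_factor(bf10):
    """Return (favored hypothesis, evidence label) for a Bayes factor.

    bf10 is the evidence for the alternative over the null hypothesis.
    Values below 1 are inverted and read as evidence for the null.
    """
    if math.isnan(bf10) or bf10 <= 0:
        raise ValueError("A Bayes factor must be a positive number")
    if bf10 == 1:
        return ("neither", "no evidence")
    favored = "alternative" if bf10 > 1 else "null"
    strength = bf10 if bf10 > 1 else 1 / bf10
    for upper_bound, label in JEFFREYS_BOUNDS:
        if strength < upper_bound:
            return (favored, label)
    return (favored, "decisive")
```

Under these cut-offs, a Bayes factor of 57.44 in favor of the alternative falls in the "very strong" category, and a Bayes factor of 0.05 counts as strong evidence for the null hypothesis.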
To test the first set of hypotheses, the day-to-day data were used. The null model for these hypotheses included the sex and age of the participant and the weekend variable as predictors of happiness. In addition, random intercepts per participant and wave were added to the mixed-effects model to account for the clustering of data per participant and wave. Per hypothesis, an alternative model was created by including the predictor of interest. For H1a, the number of steps was included as the predictor. For H1b, the average number of steps per participant was included as the predictor to test the between-subjects effect. For H1c, the number of steps deviating from the mean per participant was included as the predictor to test whether relative changes in physical activity predict changes in happiness.
To test the second set of hypotheses, a subset of the data was selected that met three criteria: the happiness measure was not missing, the physical activity data of the previous day were not missing, and the physical activity data of the subsequent day were not missing. As a consequence, all happiness measures on the first and last day of the measurement period were excluded, because they did not meet the criterion of having physical activity data available for both the previous and the subsequent day. After exclusion, 1673 observations (24.43%) were retained and the subsample consisted of 718 participants (48.61% of the sample). This subsample was slightly younger (M = 10.73, SD = 1.70) and the percentage of males was lower (42.90%) than in the total sample. The same specifications for the null model and the same approach were used as for the first set of hypotheses. For H2a, the number of steps on the previous day was used as a predictor of happiness. For H2b, happiness was used as a predictor of the number of steps on the subsequent day. For H2c, the number of steps on the previous day and happiness were used as predictors of the number of steps on the subsequent day.
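The selection criteria for the lagged analyses can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R code. Days are assumed to be indexed by consecutive integers within a measurement period, and the function name `lagged_observations` and both input dictionaries are hypothetical.

```python
def lagged_observations(happiness, daily_steps):
    """Keep happiness scores that have step data on both adjacent days.

    happiness: {(participant, day): score or None}
    daily_steps: {(participant, day): steps} for valid days only
    Returns a list of dicts with the previous-day and next-day steps.
    """
    rows = []
    for (participant, day), score in sorted(happiness.items()):
        steps_previous = daily_steps.get((participant, day - 1))
        steps_next = daily_steps.get((participant, day + 1))
        # All three criteria must hold; otherwise the observation is dropped.
        if score is None or steps_previous is None or steps_next is None:
            continue
        rows.append({
            "participant": participant,
            "day": day,
            "happiness": score,
            "steps_previous_day": steps_previous,
            "steps_next_day": steps_next,
        })
    return rows
```

Observations on the first and last day drop out automatically, because no step data exist for the day before the first day or the day after the last day.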
To test the third set of hypotheses, the same approach was used as for the second set of hypotheses, with the modification that the number of steps on the previous and subsequent days was replaced by the number of steps during the hour before and the hour after the participant’s response to the happiness question. Again, a subset of the data was used for which the number of steps during the two hours surrounding the happiness measure was available. If no steps were recorded during these two hours, the measure was excluded from the analysis. After exclusion, 8498 observations (84.88%) were retained, and the subsample consisted of 1164 participants (78.44% of the participants). Again, the subsample was slightly younger (M = 10.73, SD = 1.70), but the percentage of males was comparable to the original sample.
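The hourly variant can be sketched in the same way. The Python sketch below makes three assumptions that the text does not specify: the previous hour covers the 60 minutes before the response minute, the subsequent hour starts at the response minute, and a measure is excluded when the two hours together contain zero steps. The function name `surrounding_hour_steps` is hypothetical.

```python
def surrounding_hour_steps(minute_steps, response_minute):
    """Sum steps in the hour before and the hour after a response.

    minute_steps: {minute_of_day: steps} for one participant on one day;
    minutes without a record are treated as zero steps (assumption).
    Returns (steps_previous_hour, steps_next_hour), or None when no steps
    were recorded in the two hours and the measure would be excluded.
    """
    previous_hour = sum(
        minute_steps.get(m, 0)
        for m in range(response_minute - 60, response_minute)
    )
    next_hour = sum(
        minute_steps.get(m, 0)
        for m in range(response_minute, response_minute + 60)
    )
    if previous_hour + next_hour == 0:
        return None
    return (previous_hour, next_hour)
```

For a response at 11:00h (minute 660), the function sums minutes 600 to 659 as the previous hour and minutes 660 to 719 as the subsequent hour.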