Reactivity and reproducibility of accelerometer-based sedentary behavior and physical activity in two measurement periods: Results of a feasibility randomized-controlled study

Background: The aims of the study were to investigate measurement reactivity in sedentary behavior (SB), physical activity (PA), and accelerometer wear time in two measurement periods, to examine the reproducibility of these outcomes and to quantify measurement reactivity as a confounder for the reproducibility of SB and PA data. Methods: A total of 136 participants (65% women, mean age = 54.6 years, study period 02/2015 to 08/2016) received 7-day accelerometry at baseline and after 12 months. Latent growth models were used to identify measurement reactivity in each period. Intraclass correlations (ICC) were calculated to examine the reproducibility using two-level mixed-effects linear regression analyses. Results: At both measurement periods, participants increased time spent in SB (b=2.4 min/day; b=3.8 min/day), reduced time spent in light PA (b=2.0 min/day; b=3.2 min/day), but did not change moderate-to-vigorous PA. Participants reduced accelerometer wear time (b=5.2 min/day) only at baseline. The ICC coefficients ranged from 0.42 (95% CI=0.32-0.53) for accelerometer wear time to 0.74 (95% CI=0.68-0.79) for SB. In none of the regression models was a reactivity indicator identified as a confounder for the reproducibility of SB and PA data. Conclusions: The results show that measurement reactivity differentially influences SB and PA in two measurement periods. Although 7-day accelerometry seems to be a reproducible measure of SB and PA, our findings highlight the importance of accelerometer wear time as a crucial confounder when using accelerometry in monitoring SB and PA, planning interventions, and analyzing SB and PA data.


Background
In order to determine the level of sedentary behavior (SB) and physical activity (PA), to understand their relationship with health, and to evaluate the efficacy of behavioral interventions, an accurate measure of SB and PA is required. 1 This is challenging because both behaviors are characterized by considerable inter- and intraindividual variability. 2 Day-to-day variability around a mean level is a natural part of SB and PA and is described as the habitual level of SB and PA of an individual. 3,4 This behavioral variability needs to be carefully studied because it has major implications for measurement and for conclusions drawn from data analysis. 4 Accelerometry is frequently used to assess SB and PA. The reproducibility of accelerometer-based measures might be subject to human-related sources of bias. 5 Participants can influence accurate data measurement by changing their behavior when they know that they are wearing an accelerometer to assess habitual levels of SB and PA. 6 The motivation to be physically more active than usual due to wearing such a device or social desirability may be reasons for the presence of accelerometer measurement reactivity (AMR). 7 Particularly in behavioral intervention studies, potential effects of (repeated) assessments themselves may cloud effects of the intervention by improving self-monitoring and awareness of the behavior being measured, possibly resulting in behavior change in the intervention group as well as in the control group. 8,9 So far, evidence on AMR among adults is inconsistent 8,10-15 and mostly limited to young adults or small sample sizes. 11-14 Further, there appears to be only the study of Baumann and colleagues examining the influence of AMR on SB and the entire intensity spectrum of PA.
The analysis was carried out using the same sample as in the present study but referred to baseline data only. 15 To the best of our knowledge, none of the studies investigating AMR has examined whether and to what extent AMR occurs in more than one measurement period, particularly in the context of a behavioral intervention. The effect of AMR on repeated assessments of SB and PA could result in an over- or underestimation of baseline values and therefore in difficulties in detecting changes in SB and PA outcome variables in the long term.
The vast majority of studies applied a 7-day accelerometry protocol to determine habitual levels of SB and PA. 16 However, this approach raises the question whether this length of monitoring protocol is sufficient to reflect the mean level on the entire intensity spectrum from SB to moderate-to-vigorous PA (MVPA). Reliable data depend on the variability within a person's daily activity pattern, 3 especially for more than one measurement period. 17 Moreover, the potential impact of AMR on the reproducibility of SB and PA data has not been investigated so far. Therefore, the present study has two aims. First, to examine AMR over two measurement periods indicated by systematic changes in SB, light PA (LPA), MVPA, and accelerometer wear time. Second, to investigate the reproducibility of these outcomes and to quantify whether AMR should be considered as a relevant confounder to estimate the reproducibility of SB and PA data.

Methods
Study participants: From a sample of 1165 individuals aged between 40 and 70 years who had been recruited in general medical practices, job agencies, or via a health insurance company between June 2012 and December 2013, 95% gave written consent to be contacted again. Of those, a total of 401 persons were offered participation in a randomized-controlled study that aimed to assess the feasibility of a brief tailored letter intervention to increase PA and to reduce SB in leisure time. The design and participant flow were described in detail elsewhere. 18 SB, LPA, and MVPA were measured with accelerometry at baseline (n = 175) and after 12 months (n = 165, 94%). Two participants at each measurement period were excluded due to missing accelerometer data (excluded, n = 4). In addition, we analyzed data only among those who had worn the accelerometer ≥ 10 hours per day (h/day) on ≥ 5 days including at least one weekend day (excluded, n = 25). The final sample size comprised 136 participants.
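The wear-time inclusion rule described above can be sketched as a small filter. This is a minimal illustration, not the authors' code: the function name, the input layout (a mapping from calendar date to wear minutes), and the reading that the weekend day must itself be a valid day are assumptions.

```python
from datetime import date

MIN_WEAR_MIN_PER_DAY = 10 * 60  # >= 10 h/day, as stated in the text
MIN_VALID_DAYS = 5              # >= 5 valid days, as stated in the text


def is_valid_period(daily_wear):
    """Return True if one measurement period meets the inclusion rule.

    daily_wear: dict mapping datetime.date -> wear minutes on that day
    (hypothetical data layout chosen for this sketch).
    """
    # A day counts as valid when the device was worn for at least 10 hours.
    valid_days = [d for d, minutes in daily_wear.items()
                  if minutes >= MIN_WEAR_MIN_PER_DAY]
    # Assumption: at least one of the valid days must be a Saturday (5) or Sunday (6).
    has_weekend_day = any(d.weekday() >= 5 for d in valid_days)
    return len(valid_days) >= MIN_VALID_DAYS and has_weekend_day
```

A participant would then be retained only if both measurement periods pass this check.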
Procedure: At baseline and after 12 months, all participants underwent the following procedure: (1) cardiovascular health program including blood sample taking and standardized measurement of blood pressure, waist circumference, body weight, and height at the cardiovascular examination center of the University Medicine Greifswald; (2) self-administered assessments of socio-demographics, SB, and PA; (3) wearing an accelerometer for seven consecutive days; and (4) protocolling daily working hours over the monitoring period.
Study participants were instructed to wear the accelerometer on their right hip with an elastic band, to start the day after the cardiovascular health program in the morning after getting dressed, and to take it off during night's sleep and water activities. All participants were informed that PA would be recorded for seven days.
After baseline assessments, participants were randomized into an assessment-only group (n = 85) or an intervention group (n = 90). Additionally, for all participants, self-administered assessments of SB and PA were conducted at month 1, 3, 4, and 6 after baseline. Only individuals of the intervention group received up to three letters tailored to their self-reported SB and PA at month 1, 3, and 4. The study was conducted between February 2015 and August 2016 and was approved by the clinical ethical committee of the University Medicine Greifswald (protocol number BB 002/15a).
Measures: Accelerometer-based data were assessed using a tri-axial ActiGraph Model GT3X+ accelerometer (Pensacola, FL). The accelerometers were initialized at a sampling rate of 100 Hertz and raw data were integrated into 10-second epochs. Data from the vertical axis were used. For statistical analysis, data from the accelerometers were downloaded and processed using ActiLife software (Version 6.13.3; ActiGraph).
Time spent in SB, LPA, MVPA, and wearing the accelerometer was determined by minutes per day (min/day). Non-wear time was calculated by the Troiano algorithm, defined as at least 60 consecutive minutes of zero activity intensity counts, with allowance for ≤ 2 minutes of counts (counts/min) between 0 and 100. To identify the time spent in different intensities of PA, we used cut points according to different intensity threshold criteria. 19 Values < 100 counts/min were determined as SB, values between 100 and 2019 counts/min as LPA, and values ≥ 2020 counts/min as MVPA. Different intensity activities (LPA and MVPA) or SB were accumulated in bouts of ≥ 10 minutes, respectively. Sex, age, and years of school education (< 10 years/ 10 to 11 years/ ≥ 12 years) were obtained by a self-administered questionnaire. In addition, study group (assessment-only group/ intervention group), time (baseline/ after 12 months), recruitment site (general practice/ job center/ health insurance), first day of measurement (weekday/ weekend day), 20 season of data collection (winter/ spring/ summer), 21 and the average number of working hours 4 on each day the accelerometer was worn were included as covariates.
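The non-wear rule and the cut points described above can be sketched as follows. This is an illustrative reading of the text, not the ActiLife implementation: the function names are hypothetical, the input is assumed to be already aggregated to counts per minute, the allowance is interpreted as at most two tolerated minutes per non-wear interval, and the ≥ 10-minute bout requirement is left out.

```python
def nonwear_mask(counts, min_len=60, max_tolerated=2, tolerated_max=100):
    """Flag non-wear minutes: >= min_len consecutive minutes of zero counts,
    tolerating up to max_tolerated minutes with counts in (0, tolerated_max]."""
    n = len(counts)
    mask = [False] * n
    i = 0
    while i < n:
        if counts[i] != 0:
            i += 1
            continue
        # Grow a candidate interval that starts at a zero-count minute.
        j, tolerated, last_zero = i, 0, i
        while j < n:
            c = counts[j]
            if c == 0:
                last_zero = j
            elif c <= tolerated_max and tolerated < max_tolerated:
                tolerated += 1
            else:
                break
            j += 1
        end = last_zero + 1  # drop tolerated minutes at the end of the interval
        if end - i >= min_len:
            for k in range(i, end):
                mask[k] = True
            i = end
        else:
            i += 1
    return mask


def classify_minute(cpm):
    """Cut points from the text: < 100 SB, 100-2019 LPA, >= 2020 MVPA."""
    if cpm < 100:
        return "SB"
    if cpm < 2020:
        return "LPA"
    return "MVPA"


def summarize_day(counts_per_minute):
    """Minutes of wear time, SB, LPA, and MVPA for one day (bouts not applied)."""
    totals = {"wear": 0, "SB": 0, "LPA": 0, "MVPA": 0}
    mask = nonwear_mask(counts_per_minute)
    for cpm, is_nonwear in zip(counts_per_minute, mask):
        if is_nonwear:
            continue
        totals["wear"] += 1
        totals[classify_minute(cpm)] += 1
    return totals
```

Note that zero-count stretches shorter than 60 minutes remain wear time and are therefore counted as SB, which is the usual consequence of this kind of rule.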
Statistical analyses: We decided to include data from both study groups as all participants received almost the same assessment procedure, the feasibility study was not powered to detect differences between assessment-only group and intervention group, and previously published data revealed that there were no differences in self-reported PA and SB between groups after 12 months. 18 SB, LPA, and accelerometer wear time were approximately normally distributed, thus untransformed values were used for analyses. To account for their right-skewed distributions, MVPA data were square root transformed. For all analyses, p-values below 0.05 were considered statistically significant.
Latent growth models were used to investigate AMR for both measurement periods. 22 In line with Baumann et al., 15 time spent in SB, LPA, MVPA, and wearing the accelerometer on each of the seven days of measurement was represented by seven observed indicators of these continuous latent variables (growth factors). The indicators were regressed on latent growth factors representing trajectories of outcomes over a week. 22 A maximum likelihood estimator with robust standard errors was used. The shape of the growth curves was determined by time scores defined in the measurement model of the growth factors and matched with the observed day number of the measurement week. To specify nonlinear growth curves, an overall change function (e.g., linear, quadratic, cubic) was fitted to the sample by adding quadratic and cubic slopes of time scores to the models. Rescaled Likelihood Ratio Tests were used to test whether higher order functions of time scores and free growth factor variances were required. 23 In all models, working hours were included as a time-varying covariate specified to predict outcomes at the corresponding day of measurement. Additionally, accelerometer wear time as a time-varying covariate was used in modelling SB and activity outcome variables (LPA and MVPA). Non-zero time trends in the outcomes over the days of measurement would imply reactivity. In the models, the slope factor was freely estimated if appropriate and treated as a reactivity indicator reflecting the individual average change in outcome over time. Therefore, the factor scores of outcomes were saved and included as a reactivity indicator in further analysis. Statistical analyses were performed using Mplus version 7.316. 23 For each outcome, the average of the 7 days of measurement was calculated.
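As a reading aid, the growth model described above can be written in a generic textbook form. This is a sketch under assumptions, not the authors' exact Mplus specification: the symbols, the choice of time scores (for example \lambda_t = t - 1), and the day-specific covariate coefficient \gamma_t are introduced here for illustration only.

```latex
% Outcome y of person i on measurement day t = 1, ..., 7
y_{it} = \eta_{0i} + \eta_{1i}\,\lambda_t + \eta_{2i}\,\lambda_t^{2} + \eta_{3i}\,\lambda_t^{3}
         + \gamma_t\, w_{it} + \varepsilon_{it},
\qquad
\eta_{1i} = \alpha_1 + \zeta_{1i}
```

Here \eta_{0i} is the intercept factor, \eta_{1i} the linear slope factor (with quadratic and cubic factors \eta_{2i} and \eta_{3i} added only if required), w_{it} a time-varying covariate such as working hours or wear time, and \varepsilon_{it} the residual. A mean slope \alpha_1 different from zero corresponds to the non-zero time trend that the text interprets as reactivity, and the person-specific factor score of \eta_{1i} corresponds to the reactivity indicator.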
Two-level (individual and time) mixed-effects linear regression analyses were performed to assess changes in accelerometer-based outcomes from baseline to 12 months apart, including a random intercept for subjects. All regression models were adjusted for sex, age, education, study group, time, recruitment site, first day of measurement, and season of data collection. In addition, we added the individual average value of accelerometer wear time, the reactivity indicator of the respective SB and PA outcome, and a combination of these factors as potential covariates step-by-step.
We used intraclass correlation (ICC) coefficients to decide which model for each outcome was most appropriate. The ICC is a measure of reproducibility of replicate measures from the same subject. 24 The ICC coefficient is classified as follows: less than 0.4 indicates poor, between 0.4 and 0.75 fair to good, and 0.75 or more excellent reproducibility. 24 To illustrate the agreement between both measurement periods and to estimate the limits of agreement interval (95% Confidence Interval, CI), Bland Altman plots were applied. Statistical analyses were performed using Stata/SE version 14.2. 25
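For a random-intercept model, the ICC is commonly computed as the share of total variance that lies between subjects. The sketch below assumes the two variance components have already been estimated (for example by the mixed model) and applies the classification thresholds quoted above; the function names are hypothetical.

```python
def icc_from_variances(var_between, var_within):
    """ICC = between-subject variance / (between-subject + within-subject variance)."""
    if var_between < 0 or var_within < 0:
        raise ValueError("variance components must be non-negative")
    total = var_between + var_within
    if total == 0:
        raise ValueError("total variance is zero; ICC is undefined")
    return var_between / total


def classify_icc(icc):
    """Thresholds from the text: < 0.4 poor, 0.4 to < 0.75 fair to good, >= 0.75 excellent."""
    if icc < 0.4:
        return "poor"
    if icc < 0.75:
        return "fair to good"
    return "excellent"
```

Under this classification, both ends of the range reported in the abstract (0.42 for wear time and 0.74 for SB) fall into the "fair to good" category.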

Results
Sample characteristics: In our sample, the mean age was 54.6 years and 65.4% were women. The majority of the participants attended school 10 to 11 years (70.4%). Table 1 provides data on time spent in SB, LPA, and MVPA and wearing the accelerometer from baseline to 12 months. Accelerometer measurement reactivity: As shown in Figure 1, at baseline and after 12 months, participants increased time spent in SB by 2.4 min/day (p = 0.033) and by 3.8 min/day (p = 0.001) and reduced time spent in LPA by 2.0 min/day (p = 0.033) and by 3. The Bland Altman plots visualize the agreement between the two measurement periods, as a function of the mean of these two measurement periods (additional file 3; Figure S1). The plots showed that baseline values in all outcome variables were higher than after 12 months. The mean difference (baseline - after 12 months) of both measurements of SB was 8.8 min/day (SD = 87.9 min/day), LPA was 3.5 min/day (SD = 46.9 min/day), MVPA was 3.7 min/day (SD = 17.3 min/day), and accelerometer wear time was 16.0 min/day (SD = 95.1 min/day).
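The mean differences and SDs reported above are the two ingredients of Bland Altman limits of agreement. A minimal sketch of the conventional calculation (mean difference ± 1.96 × SD of the differences) follows. The limits themselves are not reported in this text, so anything computed this way is an illustration derived from the reported values, and the function name is hypothetical.

```python
def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """Conventional 95% limits of agreement: mean_diff +/- z * sd_diff."""
    half_width = z * sd_diff
    return mean_diff - half_width, mean_diff + half_width


# Mean difference and SD (min/day) as reported in the text, baseline minus 12 months.
reported = {
    "SB": (8.8, 87.9),
    "LPA": (3.5, 46.9),
    "MVPA": (3.7, 17.3),
    "wear time": (16.0, 95.1),
}

for outcome, (mean_diff, sd_diff) in reported.items():
    lower, upper = limits_of_agreement(mean_diff, sd_diff)
    print(f"{outcome}: {lower:.1f} to {upper:.1f} min/day")
```

The width of these intervals relative to the small mean differences is what the Discussion refers to as high intra-individual variability.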
Using the most appropriate regression model for each outcome, neither the time spent in SB, LPA, MVPA, nor accelerometer wear time significantly declined or increased over time (Table 2). In all SB and PA regression models, the average value of accelerometer wear time was a significant confounder.

Discussion
The present study has two main findings. First, there was a significant linear trend in SB and LPA time series as an indicator of AMR for baseline and 12 months apart, whereas MVPA does not seem to be affected by AMR. Further, the systematic changes within accelerometer wear time differed between the two measurement periods. Second, our results showed that the time spent in SB, LPA, and MVPA and wearing the accelerometer are fairly stable between the two measurement periods. AMR operationalized by a reactivity indicator could not be identified as a relevant confounder for the estimation of the reproducibility of SB and PA data.
In line with previous literature, the results of the present study indicate that persons change SB and PA in the presence of an accelerometer. 8,11,13-15 This study adds to the literature by showing that AMR differentially influences SB and PA in two measurement periods depending on the intensity level of PA. As shown by Baumann and colleagues, individuals appeared to replace SB with LPA. 15 LPA includes standing and walking at a light pace, and therefore SB and LPA are highly correlated. 26,27 In addition, our study shows that the trend of both time series seems to be the same over the two measurement periods. Therefore, it can be assumed that AMR has a similar impact on SB and LPA over time. The best advice for reducing bias due to AMR is to consider this in the planning of the study. 28 In case of interventions, measurement periods in terms of duration and frequency as well as number of methods used to assess SB and PA should be balanced to identify potential intervention effects.
Although the effect sizes of the changes in SB (baseline: d = 0.30; after 12 months: d = 0.50) and LPA (baseline: d = 0.29; after 12 months: d = 0.54) over one measurement period were small to medium, the validity of habitual levels of accelerometer-assessed SB and LPA could be biased due to the background noise of human-related sources of measurement errors. AMR could interact with interventions by masking their potential effects, particularly in behavioral intervention studies that typically expect small-to-medium effect sizes. In addition, there is no consensus in the literature about how many days are needed to achieve a reliable measurement of habitual levels of SB and PA 29 or over what period of time reactivity lasts. This leads to different recommendations of monitoring regimes including different lengths of familiarization periods (two days to one week). 7 Therefore, to what extent AMR may result in an over- or underestimation of SB or LPA remains to be investigated. Furthermore, our results showed that MVPA seemed to be less altered by AMR than SB and LPA. This could be because MVPA typically requires more planning and is likely to be more structured than lighter physical activities. However, it has been argued that MVPA is less predictable on a day-to-day basis and requires longer monitoring periods to determine reproducible habitual behavior. 3,4 In line with our results, a study that examined the reproducibility in accelerometer-assessed SB and PA reported the highest ICC values for SB. 17 It should be noted that ICC coefficients are restricted by the sample in which they were collected, because the magnitudes of intra- and inter-individual variability in SB and PA depend on the characteristics of the study sample.
30 Although the results reported in this study indicate that a 7-day accelerometry monitoring seems to be a reproducible measure of SB and PA, Bland Altman plots showed that there is a high intra-individual variability in SB and PA data. Therefore, the findings should be interpreted with caution and future studies are required to verify the findings of the present study using a larger sample of adults.
Our findings on accelerometer wear time highlight the importance of this factor. The value for the ICC coefficient of accelerometer wear time was the lowest of all outcomes. In terms of AMR, the systematic changes in time of wearing the accelerometer differed in magnitude between the repeated measurements. In this regard, providing clear instructions to constantly wear the device to provide valid data at each measurement time point seems to be of great importance for the compliance of participants over time. 31 Accelerometer wear time was the only relevant predictor in all SB and PA regression models, whereas AMR operationalized by a reactivity indicator could not be identified as a relevant confounder for the reproducibility of PA and SB data. In line with other studies, this indicates that accelerometer wear time should be considered as a crucial confounder in data analysis of accelerometer data. 30,32 To reduce AMR and, at the same time, increase the precision of the measurement, a longer measurement period may be reasonable. Although a longer measurement period might improve reproducibility, the burden for study participants and study feasibility should be considered because it influences the response rate and compliance. 30 Some limitations have to be discussed. First, generalizability of our results may be compromised due to selection bias. The proportion of individuals who declined to participate was 44%. Non-participation could reduce study sample representativeness, 33 in the way that individuals who are more interested in their health or motivated to change their behavior are also more likely to participate. 34 In line with this finding, our sample seems to be compliant with an average accelerometer wear time of ≥ 14 h/day. As reactivity may be most pronounced in people with a high motivation to change their behavior, AMR may have been overestimated in our study.
Second, we used hip-worn accelerometers that cannot differentiate between sitting and standing still because movement is determined by acceleration rather than body posture. 35 Thus, our findings regarding SB and LPA can give a distorted picture on AMR. Third, the inclusion of the intervention and assessment-only group in the analysis might have caused additional variation to the data, as the intervention group may (differentially) change their habitual SB and PA level over time. Unfortunately, our analyses of AMR separated by assessment-only group and intervention group revealed no representative results due to their small sample sizes. Therefore, future studies are required to examine how intervention and control groups differ with regard to AMR using a larger sample. Finally, using the slope factor as a reactivity indicator to operationalize AMR is just one of several other ways to account for AMR for estimating the reproducibility of SB and PA data.

Conclusions
In conclusion, the results of the present study show that AMR differentially influences SB and PA in two measurement periods. Although a 7-day accelerometry monitoring seems to be a reproducible measure of SB and PA, our findings highlight the importance of accelerometer wear time as a crucial confounder when using accelerometry in monitoring SB and PA, planning interventions, and analyzing SB and PA data.

Declarations
This study was funded by the Federal Ministry of Education and Research as part of the German Centre of Cardiovascular Research, DZHK (grant no. 81/Z540100152). The DZHK had no direct role in the development of methodology, the acquisition, analysis, and interpretation of data or in writing the manuscript.

Availability of data and materials
The datasets generated and/or analyzed during the current study are not publicly available due to restrictions associated with anonymity of participants but are available from the corresponding author on reasonable request. Researchers requesting the data will be required to sign a contract ensuring data usage in compliance with the statement given in the informed consent procedure and with the German data protection law, that the data will not be transferred to others, and that the data will be deleted after the intended analysis has been completed. To comply with the statement given in the informed consent, the use of the data is restricted to research related to cardiovascular-research questions. We cannot ensure to prevent use for other purposes when uploading the data for public access.

Authors' contributions
UJ and SU contributed to the conception or design of the study. AU, SB, LV, UJ, and SU contributed to the acquisition, analysis, or interpretation of data for the work. AU drafted the manuscript. All authors critically revised the manuscript and gave final approval.

Ethics approval and consent to participate
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.
This study was approved by the clinical ethical committee of the University Medicine Greifswald, Germany (protocol number BB 002/15a) and retrospectively registered to the ClinicalTrials.gov Protocol Registration and Results System (Clinical trial registration number: NCT02990039). Informed written consent was obtained from all individual participants included in the study.

Consent for publication
Not applicable.