Participants
Sixty-six healthy volunteers (mean age = 21.7 ± 1.60) were recruited from among the students of Hiroshima University. They attended a research briefing and provided written informed consent for participation. Accelerometers were distributed on the first day of the study, and physical activity was measured continuously for 165 hours, from midnight of the next day to 21:00 on the eighth day. After the measurement was completed, the experimenter recovered the device and extracted the recorded data. Participants’ personal information was protected because the device recorded only 3-axis acceleration and did not capture any other information about their lives, such as location (where they stayed, or how far they traveled).
Physical activity
Participants were asked to wear the Device Arm Triaxial Accelerometry system (UW-301BT Life Log, Hitachi Ltd, Tokyo) on their non-dominant wrist for seven days, except while bathing. This device records 3-axis acceleration at a sampling rate of 20 Hz [14]. The recorded signal was integrated to calculate the intensity of physical activity for each minute, expressed in metabolic equivalents (METs), via conversion algorithms [15, 16] provided in the device.
Since one MET is equivalent to the energy cost of sitting quietly, we treated periods of zero METs as non-wearing time (e.g., bathing). We defined a missing period as any non-wearing time of more than 30 consecutive minutes. Twenty-two participants had at least one such missing period and were excluded, leaving 44 participants (mean age = 21.5 ± 1.02; females = 18; BMI = 20.9 ± 2.94) in the analysis. Data for non-wearing periods of less than 30 minutes were replaced with the mean of the preceding 15 minutes. The imputed data were totaled for each hour, yielding 165 time points (because the time at which participants started wearing the device varied on the first day, midnight of the second day was taken as the measurement start time). The hourly total intensity of physical activity was entered into the SARIMA model.
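The preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and constants are hypothetical, the input is assumed to be a list of per-minute MET values starting at midnight of the second day, a gap of exactly 30 minutes is assumed to be imputed rather than treated as missing, and the 15-minute window is assumed to use already-imputed values when an earlier gap falls inside it.

```python
from statistics import mean

MAX_GAP_MIN = 30   # zero-MET runs longer than this count as a "missing period"
LOOKBACK_MIN = 15  # minutes before a short gap used to compute the fill value


def zero_runs(mets):
    """Return (start, end) pairs (end exclusive) of consecutive zero-MET minutes."""
    runs, start = [], None
    for i, value in enumerate(mets):
        if value == 0 and start is None:
            start = i
        elif value != 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(mets)))
    return runs


def has_missing_period(mets):
    """True if any non-wearing run exceeds MAX_GAP_MIN (participant excluded)."""
    return any(end - start > MAX_GAP_MIN for start, end in zero_runs(mets))


def impute_short_gaps(mets):
    """Replace short zero-MET runs with the mean of the preceding minutes."""
    filled = list(mets)
    for start, end in zero_runs(mets):
        if end - start > MAX_GAP_MIN:
            continue  # long gaps are left untouched; such participants are excluded
        window = filled[max(0, start - LOOKBACK_MIN):start]
        if window:  # assumption: a gap at the very start of the series stays at zero
            filled[start:end] = [mean(window)] * (end - start)
    return filled


def hourly_totals(mets):
    """Sum per-minute METs into consecutive 60-minute bins."""
    return [sum(mets[i:i + 60]) for i in range(0, len(mets), 60)]
```

With 165 hours of recording, the per-minute input would hold 9,900 values and `hourly_totals` would return the 165 time points entered into the model.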
Questionnaires
Japanese version of the Beck Depression Inventory-II (BDI-II)
The BDI-II is a widely used scale for assessing depressive symptoms, and consists of 21 self-report items. The items are scored using a 4-point scale. The Japanese version of the scale has demonstrated reliability and validity [17].
Japanese version of the Behavioral Activation for Depression Scale (BADS)
The BADS [18] measures depression-related behavior patterns based on behavioral theories of depression. It consists of 25 items rated on a 7-point scale (0: Not at all to 6: Completely). The four subscales of the BADS are Activation (AC: 7 items) representing goal-directed activation and completion of scheduled activities; Avoidance/Rumination (AR: 8 items) representing avoidance of negative aversive states and engaging in rumination; Work/School Impairment (WS: 5 items); and Social Impairment (SI: 5 items). The Japanese version of the BADS has demonstrated reliability and validity [19].
SARIMA model
A SARIMA model is formed by including seasonal elements in the ARIMA model [20]. The non-seasonal and seasonal terms of a SARIMA model are written as SARIMA(p, d, q, P, D, Q, m), where p, d, and q are the non-seasonal AR order, the non-seasonal integration (differencing) order, and the non-seasonal MA order, respectively. AR(p) means that the series' own past values up to t-p are included as predictors of the value at time t, and MA(q) means that the noise values up to t-q are also included as predictors of the value at time t. The dependent variable is usually transformed to a stationary form by differencing of order d. The SARIMA model includes the seasonal element through the hyper-parameters (P, D, Q, m), where P, D, Q, and m are the seasonal AR order, the seasonal integration order, the seasonal MA order, and the length of the seasonal cycle, respectively. Since people are usually active in 24-hour cycles, we fixed the parameter m to 24 (24 hours), giving a SARIMA(p, d, q, P, D, Q, 24) model, to obtain time-series information about daily habitual physical activity.
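For reference, the model can be written compactly in backshift notation. The symbols below are not used in the original text and are introduced here only for illustration: B is the backshift operator (B y_t = y_{t-1}), y_t is the hourly physical activity, and ε_t is white noise. The sign convention for the MA polynomials follows common software usage and is an assumption.

```latex
\phi_p(B)\,\Phi_P(B^{m})\,(1-B)^{d}\,(1-B^{m})^{D}\,y_t
  = \theta_q(B)\,\Theta_Q(B^{m})\,\varepsilon_t,
\qquad
\begin{aligned}
\phi_p(B)     &= 1 - \phi_1 B - \dots - \phi_p B^{p}, &
\theta_q(B)   &= 1 + \theta_1 B + \dots + \theta_q B^{q},\\
\Phi_P(B^{m}) &= 1 - \Phi_1 B^{m} - \dots - \Phi_P B^{Pm}, &
\Theta_Q(B^{m}) &= 1 + \Theta_1 B^{m} + \dots + \Theta_Q B^{Qm}.
\end{aligned}
```

With m = 24, the seasonal polynomials act on values exactly one day apart.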
Grid search for an individual optimal model
We performed a grid search over the model orders (p, d, q, P, D, Q, 24), fitting all parameter combinations for each individual. To narrow down the range of candidate parameters, we asked participants to report their average daily physical activity time (“On average, how much time in total did you usually spend doing physical activities on one day?”). Since the mean value was 2.3 hours (mode = 2.0 hours), the range of influence of past physical activity was limited to 0-3 hours. Hence, the hyper-parameters p and q each had four candidate values (0, 1, 2, or 3). The seasonal hyper-parameters determine the influence of the 24-hour cycle. To specify whether the previous day's cycle should be included in the model, the candidate values for the hyper-parameters (P, Q) were 0 or 1. The hyper-parameters (d, D), which specify the order of differencing, were fixed at 1 in order to reduce the computational cost. We tried including the distinction between weekdays and holidays as an exogenous regressor in the model, but this had no effect and did not change the goodness of fit of the model. Therefore, we did not include this regressor and estimated the model with fewer explanatory variables.
Finally, the number of candidate hyper-parameter combinations for SARIMA(p, d, q, P, D, Q, 24) was 64 (4 × 1 × 4 × 2 × 1 × 2 × 1). After fitting all candidate models to the data, the model with the lowest Akaike's information criterion (AIC) was selected as the optimal model for each participant.
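The grid search can be sketched as below. This is a hedged illustration rather than the authors' script: the helper names are hypothetical, and the use of the `SARIMAX` class from statsmodels is an assumption (the text states only that statsmodels was used). The fitting function is passed in as an argument so that the enumeration and selection logic can be checked without fitting any model.

```python
from itertools import product

SEASON_LENGTH = 24  # m: hours per daily cycle


def candidate_orders():
    """Enumerate the 64 (order, seasonal_order) pairs with d = D = 1."""
    return [
        ((p, 1, q), (P, 1, Q, SEASON_LENGTH))
        for p, q, P, Q in product(range(4), range(4), (0, 1), (0, 1))
    ]


def fit_aic(series, order, seasonal_order):
    """Fit one candidate and return its AIC (assumes statsmodels is installed)."""
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    result = SARIMAX(series, order=order, seasonal_order=seasonal_order).fit(disp=False)
    return result.aic


def select_best(series, fit=fit_aic):
    """Return (aic, order, seasonal_order) for the lowest-AIC candidate."""
    best = None
    for order, seasonal_order in candidate_orders():
        try:
            aic = fit(series, order, seasonal_order)
        except Exception:
            continue  # assumption: candidates that fail to fit are skipped
        if best is None or aic < best[0]:
            best = (aic, order, seasonal_order)
    return best
```

Calling `select_best(hourly_series)` on one participant's 165 hourly totals would return that participant's lowest-AIC specification; `hourly_series` is a placeholder name.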
Interpretability of model parameters
From each individual's optimal model, we obtained the weight estimates of each term, AR(p), MA(q), sAR(P), and sMA(Q), as indicators reflecting the time-series information of physical activity. Our SARIMA models fixed both the seasonal and non-seasonal differencing orders to one (i.e., d = D = 1). Therefore, each weight estimate explains the change in physical activity intensity on a daily or hourly basis.
Two seasonal parameters, sAR and sMA, may reflect daily habituation. When P = 0, sAR has no weight, which indicates that changes in activity are not affected by the previous day's activity. When P = 1, the sAR(1) weight reflects the degree to which daily physical activity varies depending on the previous day's physical activity. Therefore, unlike the case of P = 0, the physical activity pattern fluctuates (increases or decreases) from day to day, which indicates unstable habitual behavior (irregularity). The sMA(1) weight reflects the degree to which the change in physical activity from the previous day can be explained by the residual of the value predicted from past physical activity. When this weight is high, the increase in activity from the previous day is proportional to the change on the previous day, but depends on the non-linear increase on the previous day. Therefore, the sMA parameter also indicates unstable physical activity patterns from day to day. The non-seasonal parameters indicate whether activity in the more recent past (here, 0 to 3 hours ago) influences the change over a one-hour interval.
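To make the "change" being modeled concrete, the differencing with d = D = 1 and m = 24 can be written out. The notation is introduced here for illustration only: y_t is the hourly activity, B is the backshift operator (B y_t = y_{t-1}), w_t is the differenced series, and ε_t is white noise. The second line is a worked example for one hypothetical specification, (p, q, P, Q) = (1, 0, 1, 0), not a model reported in the study.

```latex
\begin{aligned}
w_t &= (1-B)(1-B^{24})\,y_t
     = (y_t - y_{t-24}) - (y_{t-1} - y_{t-25}),\\
w_t &= \phi_1 w_{t-1} + \Phi_1 w_{t-24} - \phi_1\Phi_1 w_{t-25} + \varepsilon_t .
\end{aligned}
```

In this example, φ_1 weights the change one hour earlier and Φ_1 weights the change at the same hour on the previous day.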
Statistical analysis
We performed regression analyses on the scores of each questionnaire, using the weights of the parameters in the SARIMA model as explanatory variables. To maintain analytical power, we avoided grouping or selecting participants based on their optimal models, which would have singled out participants whose model lacked a given parameter (e.g., p = 0). Instead, we set the weight of any absent parameter to zero in the regression model. The significance threshold was set at p < 0.05. All analyses were performed using the Python library “statsmodels” (ver.0.11.0).
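The zero-substitution step can be sketched as follows. The helper names are hypothetical, each participant's estimates are assumed to be available as a mapping from term name to weight, and the term labels follow the naming statsmodels uses for SARIMAX parameters (e.g., `ar.L1`, `ma.S.L24`). Fitting a single ordinary least squares model with all weights entered together is likewise an assumption; the text does not specify the regression form.

```python
# Fixed set of terms implied by the grid: p, q in 0..3 and P, Q in 0..1 with m = 24.
TERMS = [
    "ar.L1", "ar.L2", "ar.L3",
    "ma.L1", "ma.L2", "ma.L3",
    "ar.S.L24", "ma.S.L24",
]


def weight_vector(params):
    """Map one participant's estimates to a fixed-length row; absent terms become 0."""
    return [float(params.get(name, 0.0)) for name in TERMS]


def regress(scores, weight_rows):
    """Regress questionnaire scores on the weights (assumes numpy and statsmodels)."""
    import numpy as np
    import statsmodels.api as sm

    X = sm.add_constant(np.asarray(weight_rows, dtype=float))
    return sm.OLS(np.asarray(scores, dtype=float), X).fit()
```

A participant whose optimal model had p = 1 and Q = 1 only would thus contribute a row with two non-zero weights and six zeros, so every participant remains in the regression.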