Multi-level validation of the German physical activity barriers self-efficacy scale in a sample of female sixth-graders

Background The physical activity behaviour of the majority of children and adolescents is insufficient. Self-efficacy is regarded as one of the most important determinants able to enhance physical activity. The purpose of this study is to validate the German version of the physical activity barriers self-efficacy scale by means of a multi-level approach. Factorial validity, internal consistency and criterion validity were examined for the individual and the class level.
Methods The final sample comprised 454 female sixth-graders from 33 classes. The original 8-item physical activity barriers self-efficacy scale was translated using a committee approach and pilot tested with the pretest procedure. To examine the factorial validity of the translated scale, a multi-level confirmatory factor analysis was conducted with the lavaan package in R. Internal consistency was estimated with the alpha function of the psych package. Criterion validity was examined by correlating self-efficacy with moderate-to-vigorous physical activity (MVPA) assessed with accelerometers.
Results In contrast to previous validation studies, two-dimensional models fit the data better than unidimensional models. A 2 × 2-model, with two factors on both the individual and the class level, exhibited the best overall fit (χ² = 65.13, CFI = .985, TLI = .977, RMSEA = .046, SRMR = .033). The latent factors correlated highly on both levels (r = .87; r = .72). Every item loaded significantly on its respective factor on both levels. Internal consistency for the full scale and the first subscale was good on the individual level and excellent on the class level. For the second subscale, Cronbach’s alpha was low on level 1 and excellent on level 2. Relations between self-efficacy values and MVPA were weak on level 1 and strong on level 2.
Conclusions The validation speaks in favour of a two-dimensional scale measuring not only actual self-efficacy but also support from family and friends. Furthermore, the results argue for the relevance of the multi-level approach, which is able to differentiate between self-efficacy on the individual level and collective efficacy on the class level.

mellitus, cancer or cardiovascular diseases, and lowers the risk of premature death (1-4). The World Health Organization (WHO) recommends that children and youths aged five to seventeen years accumulate at least 60 minutes of moderate-to-vigorous physical activity (MVPA) per day, with MVPA comprising every kind of activity that requires at least the amount of energy spent during ordinary walking (5). There are two reasons why it is important that children and adolescents fulfil the PA recommendation. First, PA has positive short- and medium-term influences on their health and well-being (1,6-8). Second, a tracking effect makes adolescents' PA a significant predictor of PA in adulthood: the more active a person is in adolescence, the higher the probability of an active lifestyle in adulthood (7,9).
According to a questionnaire-based study, only 26% of children and adolescents in Germany aged between three and seventeen years reach 60 minutes of MVPA daily. Furthermore, fewer girls than boys (22.4% vs. 29%) fulfil the recommendation (10). In addition, PA levels in this population decline with increasing age (10,11). A systematic review by Van Hecke et al. (12) supports the effects of gender and age. Although not even the most popular device-based approaches like accelerometry offer perfectly reliable PA data (12-14), a vast majority of studies indicates that PA behaviour in adolescence does not comply with the respective recommendation (15). Moreover, the WHO recommendation is regarded merely as a minimum value. Higher MVPA levels are associated with additional health benefits (16). Therefore, it is in any case worthwhile to promote PA from an early age.
At this point, the question arises as to which determinants should be targeted to enhance young people's PA behaviour. Ecological models suggest that PA behaviour is affected by several interacting levels of influence ranging from policy variables, like investments in public recreation facilities, to intrapersonal variables, like psychological constructs (17). Among these psychological constructs, self-efficacy concerning PA is of great importance. In a review of reviews by Bauman and colleagues (18), self-efficacy was the only psychological factor consistently identified as a positive correlate and determinant of PA in children and adolescents. This finding was confirmed by an umbrella systematic review specifically focussing on psychological constructs (19). Yet another systematic review (20) focused on the PA-related age effect and indicated that self-efficacy was one of very few constructs able to diminish the decline in PA between the ages of ten and 18. Furthermore, two systematic reviews (21,22) analysing intervention studies identified PA self-efficacy as the most promising mediator to increase PA.
The manifestation of self-efficacy, however, does not only occur on the individual level but also on the group level. Bandura (23) defined the phenomenon of collective efficacy as "a group's shared belief in its conjoint capabilities to organize and execute the courses of action required to produce given levels of attainment". Although a group's collective efficacy cannot simply be seen as the sum of the members' individual self-efficacy perceptions (24), self-efficacy has been found to be a significant predictor of individually perceived collective efficacy in sports contexts (25,26). Like individual self-efficacy (e.g., 19), collective efficacy is also related to performance. A meta-analysis found a significant positive relation between collective efficacy and performance in organizational settings including sports (27). Furthermore, collective efficacy also affects individual performances and attitudes. In two experiments manipulating the individual perception of collective efficacy, the participants in the low collective efficacy condition set lower performance goals, showed lower commitment and put less effort into a cycling task than the participants perceiving a high collective efficacy (28,29). Thus, individual self-efficacy and collective efficacy are intertwined, as individual self-efficacy acts as a predictor of collective efficacy (25,26) and collective efficacy in turn influences the individual (28). This point implies specific consequences for analysing self-efficacy and collective efficacy perceptions of individuals nested in groups, like athletes in teams or students in classes. Under these circumstances, the assumption that individual perceptions are independent of one another can no longer be maintained. Instead, the use of multi-level modeling is strongly recommended (24,30).
Due to its high relevance, self-efficacy has been extensively examined in the field of PA. Over time, however, the definitions and the respective measures of youth PA self-efficacy have become more and more heterogeneous. Therefore, Voskuil and Robbins (31) conducted a concept analysis regarding the defining attributes, antecedents and consequences of the different conceptualisations.
Ultimately, they defined youth PA self-efficacy "as a youth's belief in his/her capability to participate in PA and to choose PA despite existing barriers" (31). The conceptualization of barriers self-efficacy regarding PA by Dishman and colleagues (32,33) covers the two main points of this definition by addressing both the self-perceived confidence in the capability to be physically active and the recognition of barriers to PA (31).
To date, no instruments exist which are specifically constructed and appropriately validated to examine PA self-efficacy of children and adolescents in secondary school in Germany. Questionnaires specifically designed for children are needed, especially regarding the wording of items. Twelve-year-olds produce a response quality worse than that of youths aged fourteen (34). Scott (35) even argues that adolescents cannot respond properly to adult items before the age of sixteen.
Therefore, the purpose of this study is to validate a German version of the barriers self-efficacy scale for PA (33) in a sample of sixth-graders. The validation is conducted in accordance with the multi-level approach described by Huang (36). Factor structure and scale dimensionality are analysed by means of a multi-level confirmatory factor analysis (MCFA). Internal consistency is likewise estimated for both the individual and the group level. Furthermore, criterion validity is tested by examining the relation between PA barriers self-efficacy and actual PA behaviour on both levels.

Methods
Participants
The sample included 507 female sixth-graders recruited from 33 classes of fifteen secondary schools in Munich. The participants were part of the CReActivity project, a randomized controlled trial aiming to promote PA of female adolescents (37). Mean age was 11.61 years (SD = .55

Measures
Barriers self-efficacy
The barriers self-efficacy scale was used to assess the girls' perceived self-efficacy to be physically active (33). The scale contains eight items. The original items were validated in samples of sixth- and eighth-grade girls. Confirmatory factor analyses supported a unidimensional model (32,33,39). Participants responded on a five-point Likert-type scale ranging from 1 ("Disagree a lot") to 5 ("Agree a lot"). The scale validated here was translated into German by means of the committee approach and was pilot tested with the pretest procedure (40,41).

Physical activity
To assess leisure-time MVPA, participants wore accelerometers (ActiGraph GT3X or wGT3X-BT) for seven consecutive days, except during water-based activities. The device was placed on the right hip. Sampling rate was set to 30 Hz. On weekdays, participants had to wear it from, at the latest, their way to school until 9 pm or until they went to bed. On weekend days, the students had to wear it from the time they woke up until 9 pm or until they went to bed.

Procedures
Several weeks before the beginning of the data assessments, students and their parents were informed in writing about the purpose and the procedure of the assessment. Students participated only if they had provided a written consent form beforehand.
Data assessments took place at the beginning of a physical education (PE) lesson. Codes were used to ensure the anonymity of the participants. Before handing out the accelerometers, the assessment team explained how to put them on. At least 25% of the students of each class received a sheet with information regarding the accelerometers enabling them to serve as contact persons for their classmates. After the students had put on the accelerometers correctly, they filled out the questionnaire. The actual PE lesson did not start until the last student had completed the questionnaire.

Data analysis
Multi-level validation of the barriers self-efficacy for PA scale
As the sample examined in this study provides clustered data, the validation is based on the multilevel approach by Huang (36). Ignoring the clustered nature of the data can lead to wrong parameter estimates, standard errors and model fits. It is recommended to account for multilevel data even if the intracluster correlations (ICCs) of the single manifest variables are small (42). In nested data, factor structures might not be the same for each level (36). MCFA provides the opportunity to examine individual- and group-level data simultaneously. To this end, the total population covariance matrix is split into a pooled within-group covariance matrix and a between-group covariance matrix. Thereby, both within- and between-group effects may be estimated at the same time. Huang (36) offers an R syntax to be used with the lavaan package (43) and a function for generating the required matrices based on the five MCFA steps outlined by Hox (44, Chap. 14).
In step 1, a single-level factor analysis is performed using only the pooled within-group covariance matrix. In step 2, the null model, which assumes the factor structure of step 1 for both levels, is fitted. In this step, both the pooled within-group and the between-group covariance matrices are used as input. Equality constraints for the two levels are applied, meaning that factor loadings, variances and covariances for every manifest variable and latent factor are assumed to be the same for the two levels. In step 3, new group-level latent variables are introduced to estimate the variance attributed to the groups. This step is referred to as the independence model since the newly introduced group-level variables are not allowed to covary. This constraint is removed in step 4, the so-called saturated model. All degrees of freedom at the between-group level are now used, making it a fully saturated model. Finally, in step 5, the model that is actually hypothesized is specified. At least one overall general factor is added for the between-group level, which is assumed to account for the correlation of the latent group-level factors (36). For every model, small negative residual variances on the class level are fixed to zero to allow the model to fully converge. This common practice is particularly required when the number of groups on level 2 is small and ICCs are close to zero (44).
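To illustrate the split of the total covariance structure that underlies all five steps, the following is a minimal sketch in plain Python. It assumes the commonly used estimators with denominators N - G for the pooled within-group matrix and G - 1 for the (scaled) between-group matrix; whether the function provided by Huang (36) uses exactly these denominators is an assumption. The function name `split_covariance` and the toy data are hypothetical and do not stem from the study.

```python
def split_covariance(rows, groups):
    """Split item scores into pooled within-group and between-group covariance matrices.

    rows   : list of N observations, each a list of p item scores
    groups : list of N group labels (e.g. class IDs)
    Assumed denominators: N - G (within) and G - 1 (scaled between).
    """
    n, p = len(rows), len(rows[0])
    labels = sorted(set(groups))
    g = len(labels)
    grand = [sum(r[j] for r in rows) / n for j in range(p)]
    within = [[0.0] * p for _ in range(p)]
    between = [[0.0] * p for _ in range(p)]
    for label in labels:
        members = [r for r, lab in zip(rows, groups) if lab == label]
        m = len(members)
        mean = [sum(r[j] for r in members) / m for j in range(p)]
        for a in range(p):
            for b in range(p):
                # deviations of students from their own class mean
                within[a][b] += sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in members)
                # deviations of the class mean from the grand mean, weighted by class size
                between[a][b] += m * (mean[a] - grand[a]) * (mean[b] - grand[b])
    s_pw = [[v / (n - g) for v in row] for row in within]
    s_b = [[v / (g - 1) for v in row] for row in between]
    return s_pw, s_b


# Hypothetical toy data: nine students, two items, three classes
rows = [[1, 2], [2, 3], [3, 3], [4, 5], [5, 4], [3, 5], [2, 1], [1, 1], [2, 2]]
groups = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
s_pw, s_b = split_covariance(rows, groups)
```

With these denominators, the total sum of squares and cross-products equals (N - G) times the within matrix plus (G - 1) times the between matrix, which is why both levels can be modelled simultaneously without counting any variance twice.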
To evaluate model fit, several fit indices were considered (45): the χ² likelihood ratio statistic, the comparative fit index (CFI), the Tucker-Lewis index (TLI), the root mean square error of approximation (RMSEA) and the standardized root mean square residual (SRMR). Multiple indices were used because the χ² goodness-of-fit test tends to reject reasonably fitting models in large samples (45). CFI and TLI values greater than .95 indicate a good model fit, as do RMSEA and SRMR values less than or equal to .08 (45).
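The cutoffs named above can be written down as a simple check. The sketch below is only an illustration of the decision rule; the function name `check_fit` is hypothetical, and the example values are those reported for the 2 × 2-model in the abstract.

```python
def check_fit(cfi, tli, rmsea, srmr):
    """Apply the cutoffs stated in the text: CFI/TLI > .95, RMSEA/SRMR <= .08."""
    return {
        "CFI": cfi > 0.95,
        "TLI": tli > 0.95,
        "RMSEA": rmsea <= 0.08,
        "SRMR": srmr <= 0.08,
    }


# Fit indices reported for the 2 x 2-model in the abstract
checks = check_fit(cfi=0.985, tli=0.977, rmsea=0.046, srmr=0.033)
good_fit = all(checks.values())
```

Such cutoffs are conventions rather than strict tests, so the result is best read alongside the χ² statistic and the parsimony considerations.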
Furthermore, as fit tends to improve when more variables are included in the model, parsimony is another criterion taken into account when choosing a preferred model. Akaike's information criterion (AIC) was considered in model comparison as it not only evaluates model fit but also penalizes an increasing number of estimated parameters (46). Lower AIC values indicate better model fit. Ultimately, the aim is a model that explains as much variance as possible with as few variables as necessary. Therefore, the optimal combination of model fit and parsimony is sought (47).
Pearson r is reported for both the pooled within-group correlation and the between-group correlation.
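The trade-off between fit and parsimony expressed by the AIC can be sketched as follows, using the general definition AIC = 2k - 2 ln L, where k is the number of estimated parameters and ln L the log-likelihood. The model names, log-likelihoods and parameter counts below are hypothetical illustration values, not results of this study.

```python
def aic(log_likelihood, n_params):
    """General definition: AIC = 2k - 2 ln L (lower values are preferred)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical candidates: (log-likelihood, number of estimated parameters)
candidates = {
    "one_factor": (-5000.0, 24),
    "two_factor": (-4990.0, 26),
}
aic_values = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
preferred = min(aic_values, key=aic_values.get)
```

In this made-up case the two additional parameters cost 4 AIC points but improve -2 ln L by 20 points, so the larger model would be preferred despite being less parsimonious.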

Physical activity
During PA data download, the vector magnitude counts were summed over 1-second epochs (10-second epochs for the GT3X because of its lower memory and battery capacities). The low frequency extension filter was not used. Wear-time validation was conducted with the algorithm by Choi, Liu, Matthews and Buchowski (50). A participant's PA data were considered valid when data from at least three weekdays and one weekend day were available, with at least eight hours of wear time required for a valid day. The wear-time validated PA data were analysed using the cut points by Hänggi, Phillips and Rowlands (51) to calculate the average duration of MVPA per day for each participant. The cut points by Hänggi et al. (51) were chosen because they provide a precise assessment and were validated by applying the same data sampling and processing criteria as the ones chosen for this study (52).
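The inclusion rule for PA data described above can be expressed as a short check. The sketch below assumes that daily wear time and daily MVPA minutes have already been derived (the wear-time algorithm and the cut points themselves are not implemented here) and that MVPA is averaged over valid days only, which is an assumption. All names and the example week are hypothetical.

```python
MIN_WEAR_HOURS = 8      # wear time required for a valid day
MIN_WEEKDAYS = 3        # valid weekdays required
MIN_WEEKEND_DAYS = 1    # valid weekend days required


def valid_days(days):
    """Keep only days with sufficient wear time."""
    return [d for d in days if d["wear_hours"] >= MIN_WEAR_HOURS]


def is_valid_participant(days):
    """At least three valid weekdays and one valid weekend day."""
    valid = valid_days(days)
    weekdays = sum(1 for d in valid if not d["weekend"])
    weekend_days = sum(1 for d in valid if d["weekend"])
    return weekdays >= MIN_WEEKDAYS and weekend_days >= MIN_WEEKEND_DAYS


def mean_daily_mvpa(days):
    """Average MVPA minutes per day, computed over valid days only (assumption)."""
    valid = valid_days(days)
    return sum(d["mvpa_min"] for d in valid) / len(valid)


# Hypothetical week of one participant
week = [
    {"weekend": False, "wear_hours": 10.5, "mvpa_min": 40.0},
    {"weekend": False, "wear_hours": 9.0, "mvpa_min": 50.0},
    {"weekend": False, "wear_hours": 6.0, "mvpa_min": 20.0},  # too little wear time
    {"weekend": False, "wear_hours": 8.0, "mvpa_min": 30.0},
    {"weekend": True, "wear_hours": 12.0, "mvpa_min": 60.0},
    {"weekend": True, "wear_hours": 4.0, "mvpa_min": 10.0},   # too little wear time
]
included = is_valid_participant(week)
average_mvpa = mean_daily_mvpa(week)
```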

Results
Of the 507 participants originally included in the sample, 53 had missing values in at least one item of the self-efficacy scale. To avoid bias, missing values were not replaced. The participants excluded from the analysis did not differ significantly from the valid sample regarding BMI, SES, self-efficacy and MVPA. Thus, the final sample comprised 454 sixth-graders.
The descriptive statistics of the eight items of the barriers self-efficacy scale are presented in Table 1.
For the single-level one-factor model, an acceptable fit was found (model A in Table 2). In the last step of the algorithm outlined by Hox (44, Chap. 14) and Huang (36), model B was obtained (see Table 2). For this model, one overall general factor is added for the class level (1 × 1-model).
Model B has twice as many degrees of freedom as the single-level model A, which led to an increase in the χ² and AIC values. However, according to the CFI, TLI and RMSEA indices, model fit improved. On level 1 all factor loadings were significant, whereas on level 2 only three out of eight items exhibited significant loadings (Table 3). Unlike the approach proposed by Huang (36), model C (1 × 1-model) was specified without a nested structure, meaning that the measurement model for level 1 is not included in level 2. Except for SRMR, every fit index worsened slightly. Now, however, each factor loading on level 2 was significant (see Table 3). For the models D and E (2 × 2-models), a second latent factor was  Aiming at a well-fitting and at the same time parsimonious solution, models with two factors on level 1 and only one factor on level 2 were specified as well (

Discussion
The guidelines for PA (5) are fulfilled by only a minority of children, adolescents and adults (e.g., 12,15). As individual PA behaviour is often sustained from adolescence to adulthood (e.g., 9), interventions trying to enhance PA of children and adolescents are of great importance. To improve young people's PA behaviour, individual self-efficacy is one of the most important determinants to focus on (e.g., 19,20). The barriers self-efficacy scale (33) assesses adolescents' individual self-efficacy regarding PA and reflects the findings of the concept analysis by Voskuil and Robbins (31).
In this study, a German version of the barriers self-efficacy scale was validated in terms of its factorial validity, internal consistency and criterion validity. Self-efficacy does not only act on the individual level but also on a group level, where it is termed collective efficacy (23). Additionally, individual self-efficacy and collective efficacy interact (e.g., 25,28). Therefore, and since the scale was validated with clustered data, the analysis was conducted within a multi-level framework (36). This way, a mismatch between the constitution of self-efficacy and its assessment and analysis was circumvented. The barriers self-efficacy scale can be applied to measure the construct both on the individual and the group level at the same time by applying the summary index model (53). It suggests that the aggregated variable on the group level can be the sum or the average of a variable assessed at the individual level.
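The aggregation implied by the summary index model can be illustrated with a minimal sketch: the class-level score is the average of the individual scores in a class, and the individual-level component is each student's deviation from her class mean. The function name, class labels and scores below are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def class_means(scores, class_ids):
    """Average the individual scale scores within each class (group-level variable)."""
    by_class = defaultdict(list)
    for score, class_id in zip(scores, class_ids):
        by_class[class_id].append(score)
    return {class_id: mean(values) for class_id, values in by_class.items()}


# Hypothetical scale scores of five students in two classes
scores = [3.0, 4.0, 5.0, 2.0, 3.0]
class_ids = ["6a", "6a", "6a", "6b", "6b"]

level2 = class_means(scores, class_ids)
# Individual-level component: deviation of each student from her class mean
level1 = [s - level2[c] for s, c in zip(scores, class_ids)]
```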
Self-efficacy of our sample was comparable to the self-efficacy of the sample of sixth-graders used to validate the original scale by Dishman and colleagues (33) in terms of the means (3.61 vs. 3.74, see Table 1). Standard deviation was almost identical (0.83 vs. 0.79), kurtosis of the items was similar (-1.10 to 0.03 vs. -1.05 to 0.65).
The fit of the single-level one-factor model A (Table 2) was acceptable, which justified conducting the subsequent steps of the MCFA. The small ICCs of the items suggested no significant variance between the classes. The fits of the null model, independence model and saturated model did not allow for a clear-cut inference about a statistically significant group-level variance (36,44).
Concerning the fit indices which are less sensitive to the number of parameters to be estimated, the fits of the multi-level models B and C were better than that of the single-level model A (see Table 2), which justifies conducting an MCFA. The introduction of a second factor on both levels finally led to the best fit, that of model E, which was specified without a nested structure. The correlation of the two latent factors was large on both levels. Given the aim of a parsimonious solution, this would speak in favour of the models with two factors on level 1 and only one factor on level 2, as they fit only slightly worse than model E. However, the decision for model E is justifiable for several reasons.
The wording of the items 2 and 5 addresses the family and peers of the participant. The answers to these items mainly depend on circumstances which cannot be fully controlled by an adolescent. If the parents both work full time and, on top of that, are not interested in being physically active, the child lacks the means to change these circumstances. Similarly, if the best friend does not like to be active and is not reachable within a manageable distance for a child, chances of regularly engaging in PA together are low. Thus, an actually self-efficacious adolescent can disagree with these items while agreeing with the remaining items which refer to more personally controllable aspects and attitudes.
The fact that items 2 and 5 show the lowest loadings in the single-level model A and exhibit comparatively low correlations with the remaining items indicates that this scenario occurred in a considerable number of cases, which argues for a second factor on the individual level.
The sample of this study contained only adolescents attending school in the city of Munich. Living in an urban area with good infrastructure, they have relatively good opportunities to visit their friends on their own on foot, by bike or by public transport. In a sample including students from both urban and rural areas, the possibilities of visiting the best friend on one's own might differ to a greater extent between classes. In this case, between-group variance specifically concerning item 5 would increase. This speaks in favour of a second factor on level 2 as well, which makes a multi-level approach and finally a 2 × 2-solution even more appropriate.
These content-related considerations need to be corroborated by empirical evidence. Hence, it is of great importance that the parameter estimates of the MCFA support a solution with two factors on both levels. Model E offers the combination of the best fit and the highest loadings, all of which are statistically significant, and is thus deemed the best solution. Striving for a more parsimonious solution should be viewed critically as it involves a loss of precision and of well-interpretable information. On both levels, two latent factors can be identified and, although highly correlated, differentiated in terms of their meaning.
Six items refer to the actual self-efficacy of the adolescent. These items assess the core of the construct. The items 2 and 5 relate to the personal environment regarding support for being physically active. This interpretation holds for both levels. Individuals as well as school classes can differ in their (average) self-efficacy and in their (average) social resources regarding support of PA.
The high correlations of the two latent factors on both levels (r ≥ 0.72) reflect their interdependence.
An adolescent's self-efficacy for being physically active is associated with the attitudes and behaviour of parents and peers. The concept of vicarious experience (23) suggests that observing another person performing successfully can enhance the confidence in one's own ability to succeed in the same task. As a result, an adolescent's PA behaviour may influence his/her best friend and vice versa. Furthermore, the attraction paradigm (54) posits that perceived similarity to a peer is a major factor deciding whether a relationship turns into a close friendship or not. Taken together, it can be assumed that close friends often think the same way about being physically active, as their similarity led to their friendship in the first place (54) and vicarious experiences help them become even more alike (23). This could explain the correlation of item 5 with the six items assessing the actual PA self-efficacy.
Regarding vicarious experience, it should be noted that "the greater the assumed similarity, the more persuasive are the models' successes and failures" (23). Since adolescents normally perceive their parents as being less similar to them than their friends, the parents' modeling of PA is not sufficient to enhance their children's PA self-efficacy. Instead, it is parental PA support that has a significant positive effect (55). For sixth-graders, particularly parents' emotional and instrumental social support have an effect on the adolescents' PA self-efficacy (56). These findings could explain why responses to item 2 correlate highly with actual self-efficacy.
Finally, combining the possible scenarios described above with the empirical results regarding factor loadings and the correlations of items and latent factors, a strong positive association between actual PA self-efficacy and support from parents and peers can be assumed for many, but not for all adolescents. Thus, although previous validations of the barriers self-efficacy scale supported a unidimensional model (32,33,39), the recognition of a conceptually and statistically distinguishable factor assessing PA-related support by parents and peers is regarded as advisable.
Reliability was estimated for the individual and the class level separately (36). Cronbach's alpha for the eight-item scale was good on level 1 and excellent on level 2 (57). Cronbach's alpha is positively associated with the number of items (58). Nevertheless, alpha values for the shorter six-item subscale representing actual PA self-efficacy were not lower, which speaks for an even higher internal consistency of this subset of items compared to the complete scale. Cronbach's alpha for the two-item support factor was low on level 1 and excellent on level 2. Thus, the association of support from family and peers becomes less ambiguous when the nesting of students in classes is considered. Higher reliability values on the group level were expected since reliability tends to increase and measurement error tends to decrease when measures are aggregated across students within the same classes (59).
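How level-specific alpha values can arise may be sketched by computing Cronbach's alpha directly from a covariance matrix, once for the within-group and once for the between-group matrix. This is a minimal illustration under the assumption that alpha is derived from each level's covariance matrix; it is not the implementation of the psych package, and both matrices below are hypothetical.

```python
def alpha_from_cov(cov):
    """Cronbach's alpha from a k x k item covariance matrix:
    alpha = k / (k - 1) * (1 - sum of item variances / variance of the sum score)."""
    k = len(cov)
    item_variances = sum(cov[i][i] for i in range(k))
    total_variance = sum(sum(row) for row in cov)
    return k / (k - 1) * (1 - item_variances / total_variance)


# Hypothetical three-item matrices: moderate covariances within classes,
# high covariances between class means
within_cov = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
between_cov = [[1.0, 0.9, 0.9], [0.9, 1.0, 0.9], [0.9, 0.9, 1.0]]

alpha_level1 = alpha_from_cov(within_cov)
alpha_level2 = alpha_from_cov(between_cov)
```

Because aggregation averages out part of the individual measurement error, item covariances tend to be relatively larger on the class level, which in this sketch yields a higher alpha on level 2 than on level 1.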
Likewise, the use of aggregated measures on the class level normally leads to higher factor loadings and correlations on this level (59). Whereas in the final model E, loadings for most items are
Whereas the eight-item scale as well as the items of the first factor exhibited good to excellent Cronbach's alpha values, the internal consistency of the second factor is poor on the individual level. While this could have affected the correlation between the two factors as well as the criterion validity (61), it should also be discussed whether the items of the second factor should be assigned to one common factor, since the answers to these items depend on two different groups of people (family and friends).
Additionally, the findings should be verified in a more diverse sample comprising girls and boys of different ages and from both rural and urban backgrounds. Further research on the construct or, more specifically, on the barriers self-efficacy scale (33) should include more classes on the group level and also more students per class.

Additional files
Additional file 1: file name: sample; format: .docx; description: information on how the sample was recruited, how representative the sample was of the target group, how the analysed sample differed from the recruited sample and how any missing data were handled
Additional file 2: file name: data set; format: .sav; description: minimal dataset necessary to replicate the analysis

Ethics approval and consent to participate
The study was approved by the ethics commission of the Technical University of Munich (155/16 S) and the Ministry of culture and education of the state of Bavaria in Germany.
Students participated only if they had provided, beforehand, a consent form signed by themselves and their parents.

Consent for publication
Not applicable.

Availability of data and materials
All data analysed during this study are included in the supplementary information files of this article.

Competing interests
The authors declare that they have no competing interests.

Funding
The study is funded by the German Research Foundation (DE 2680/3-1). The researchers are independent of the funders, who have no influence on study design, conduct, analyses, or interpretation of the data, the decision to submit the results or the preparation of the manuscript.

Authors' contributions
JB collected, analysed and interpreted the data and drafted the manuscript. DJS also collected data and contributed to the data analysis. SH lent substantial support in the analysis and interpretation of the data. YD supervised the cooperation of the authors and revised the manuscript critically for important intellectual content. All authors read and approved the final manuscript.