Adapting school physical activity and health surveys for children with disabilities

Background: The physical activity (PA) and health behaviours of pupils in special education classes and schools (SECS) are largely understudied. Results from surveys need to be comparable to data from children in general education. However, there is no systematic way to modify existing surveys for the SECS environment. Therefore, the purpose of this study is to report the process of adapting PA and health surveys so that the results can be compared with those of the general population. Methods: A series of four studies addressed this purpose: 1. Usability of the survey in SECS, 2. Pilot study data collection, 3. Further testing with a basic version, 4. National data collection among pupils in SECS. Items from the Washington Group Child Functioning Module (CFM) were continuously developed. Detailed information from researchers who administered the surveys, results from the teachers, focus groups of pupils in the pilot studies, as well as data from the main data collection were used to guide the development of each sequential study. Results: The proposed questions were completed in 25min by pupils in SECS in study 1. Visual analogue scales were tested in study 2 for the CFM, and feedback generated from the researchers led to the development of a simplified version that was tested in study 3. In total, three versions of the survey were created and instructions were given to teachers to administer them. Of the 889 respondents, the largest group (n=396) completed the simplified version of the survey, and the average time taken ranged from 13min to 21min. Teachers reported positively about the administration of surveys. Conclusions: In nationally representative surveys, it is important to give teachers clear instructions on survey allocation and the study protocol for surveys of school-aged children. Pupils in SECS may need help with explanations of the survey questions, and more development work is needed to investigate the effects this help has on survey completion. The process of adapting PA and health behaviour surveys developed in this project could be adopted by other countries.

Background
The health of adolescents is a top research priority because these individuals have their entire lives ahead of them. In addition, young adolescence is a time when many physical, mental and social changes take place that influence the social determinants of health (1).
Moreover, physical activity (PA) patterns and health behaviours are established for future years (2,3). Survey data can reveal much about the lives of young adolescents, especially from the individual point of view (4). The perceptions and reported behaviours of young adolescents are vital in understanding the mechanisms for improving health behaviours and habits (5). Studies such as the Health Behaviour in School-aged Children: WHO Collaborative Cross-National study (HBSC) have carried out this type of research for over 30 years, with over 45 countries involved in producing nationally representative data. Outputs have included changes in national policy (6), involvement of youth leadership programs (7), and reporting with UNICEF (8), to name a few. However, studies like the HBSC have their own limitations in that not all pupils are represented. For example, HBSC Canada includes over 17000 children in its data collection, but does not include pupils in special schools, incarcerated youth, or home schooled children (9). Such exclusions limit the global interpretation of the health of young adolescents.
Historically, when the current Basic Education Act 628/1998 (10) was enacted, it did not support the implementation of inclusion in Finnish comprehensive schools because, according to the act, young adolescents with special needs were placed in special education. In 2006, the Ministry of Education and Culture appointed a steering group that created a proposal for a long-term strategy for the development of special needs and inclusive education. The strategy for special needs and inclusive education emphasized the importance of the right of every young adolescent to attend the nearest mainstream school, i.e. a school where the individual would normally be assigned. In this way, pupils with special needs could create and maintain their social relationships with other peers in their neighbourhood. The strategy in pre-primary and basic education was published in 2007 (11) and put into practice in autumn 2010, leading to amendments in the Education Act 2010 (12). The supplemented act laid a basis for the new era in the implementation of inclusive policy in Finnish comprehensive schools in 2011 by introducing a new three-tiered support model, i.e. general support, intensified support, and special support for children with special educational needs (12). The new support model shifted the focus of special education to the earliest possible support for individuals, to prevent the emergence and growth of problems during later years and permanent placement in special education. Since then, there has been a linear increase in the proportion of pupils needing intensified and special support, from 8% in 2010 to 16% in 2016 (13). As such, there is a demand for adapting surveys for use in both mainstream and special educational settings.
Under article 31 of the Convention on the Rights of Persons with Disabilities, it is important for population-based statistics to be disaggregated by disability (14). Over 175 countries have ratified the convention since 2008, and Finland ratified it in 2016 (15). To be compliant with this convention, surveys like the HBSC need to be adapted so that pupils in special education are also represented in these studies. However, directly transferring the same surveys to special education classes and schools (SECS) is not advisable. This is because pupils in SECS often have difficulties in reading (16), can experience difficulties in choosing the right options when responding to Likert scales, and may be unable to comprehend complex questions (17). Although these obstacles may be overcome by simplifying the language, offering fewer response options and reducing the number of questions in the survey, adaptations to surveys still require development and testing to establish the credibility of the data.
According to survey methodologies, several stages in the cognitive processes involved in survey completion take place, including 1) Question interpretation, 2) Information retrieval, 3) Judgement formation, 4) Response formatting, and potentially 5) Response editing (18). At each of these stages, reporting errors may take place. Therefore, survey designers need to reduce the errors by stabilising the conditions of the first two stages. For example, according to Youngman (19), the use of dichotomous questions with yes or no responses can create difficulties in interpretation of the question (stage 1), but the processes in the remaining stages are less complex. However, a positive bias towards 'yes' responses is likely, and children in the SECS environment may be more likely to be sensitive to negative stimuli (20). Osgood and colleagues (21) suggested that Likert-scale items may make stages 3 and 4 more stable, but can demand more cognitive processing than questions with dichotomous responses. This needs to be taken into consideration when the respondents have cognitive difficulties.
Another dimension to consider in the development of survey items is the inclusion of visual analogue scales (VAS). VAS have been useful in research to help respondents identify their feelings or opinions (22), and could alleviate some of the challenges of survey completion among pupils in SECS.
Although there are some longitudinal studies that have included pupils who could be identified as having special educational needs, such as 'Growing up in Ireland' (23) and the 'National Educational Panel Study' (24), the few cross-sectional studies that have collected data with this population have been based on proxy reporting for the individual (25). The direct involvement of pupils in SECS is a fundamental right of the child (26) and can be fulfilled through self-reporting surveys, yet few designs for survey collection have allowed for that (27).
A process of survey development for an underrepresented population group, such as pupils with special educational needs, is urgently needed. There are concerns that a lack of understanding of survey instruments may lead to missing data (28), raising concerns about the validity of the data (24). Surveys may also take too much time for pupils in SECS to complete (29), and systematic ways of collecting data are lacking (28). These issues can result in low comparability within and between populations (30), fewer possibilities to make statistical inferences (28), and difficulties in advancing the scientific knowledge of PA and health behaviours of pupils in SECS. Therefore, in this paper, we use four studies to describe the process of adapting general school PA and health behaviour surveys for completion by all young adolescents, including pupils in SECS.

Methods And Results
The main aims of the study were to plan and carry out data collection with the purpose to allow pupils in SECS to complete the PA and health behaviour survey. We reviewed the existing available self-report surveys and the protocols to go with the data collection and then proceeded with a series of studies that led to the adaptation, creation and implementation of three types of surveys. We used terms to help us identify the short, small and simple language survey as 'S', and the longest, largest survey as 'L', and the survey type with easy-to-read language and of normal length was labelled 'M' for middle.
We also made a distinction between the younger (Y) and older (O) version to allow for the inclusion of questions intended only for the older respondents, namely 15-year-olds.
We conducted a series of studies and collected data from pupils in SECS who, to the best of their abilities, completed self-reported surveys (Figure 1). In many SECS, teacher assistants or other support staff are available; however, these surveys were designed for self-reporting rather than proxy reporting. The first study was to create the survey and test its usability. This was the start of the process and later led to the M and L surveys. The second study was a pilot study to test the feasibility of carrying out the M and L surveys. From the pilot study, we further refined the survey (third study) so that more pupils in SECS would be able to participate. We identified the need for a shorter survey (S), and in the fourth study, we conducted a national data collection. These studies were guided by the areas highlighted for survey development by Coolican (31). These areas  (32). There were some items that overlapped with the HBSC study. This led to too many possible questions to include in the survey for PA and health behaviours. The HBSC survey had 210 items from 121 questions. The F-SPA study had 266 items from 60 questions.
At the start of the questionnaire development, the 2018 HBSC survey consisted of three types of questions (33): international mandatory, international optional and national. We gave priority in the selection of items to national and international mandatory items, and excluded items that were outside our current study interests; for example, we removed items on intimate relationships.
A group of experts (the authors; see acknowledgements) examined the merged questionnaire (SECS) and discussed the direction and themes to be covered in the questionnaire. The meetings were guided by the principles from Coolican (31). These principles were: 1) Only ask what is needed (low number of questions); 2) Ensure the questions can be answered (appropriate question language); 3) Enable truthful answering to questions (appropriate response scales); and 4) Reduce the number of items that cannot be refused or are unanswered (avoid difficult items). We reviewed the questions one by one, and a consensus was reached on which items would be included in the usability study. There were two versions of the survey: one for the older (O) group, with additional questions on risk behaviours and health literacy, and another for the younger (Y) group, which had the core survey questions.
In addition, we examined the response categories for each item and made further modifications to allow for as much comparability as possible between the items in SECS and the original survey items. For many of the original items with scales greater than a 4-point scale, the number of response options was reduced to either a three- or four-point scale, depending on whether the original response categories had a mid-point. We also created visual analogue scales (VAS) at the anchors and central point. For example, we modified the Child Functioning Module (34) by including a separate visual representation for each level of the response category to indicate severity. The four-point scale consisted of the following response options: "no difficulties", "some difficulties", "a lot of difficulties" and "cannot do". A temperature gauge, similar to those seen in Finnish saunas, illustrated intensity for the first three options, and a big cross represented the "cannot do" option.
Halfway through the questionnaire, we gave the respondents the option to take a mental break from completing the survey by looking at a short comic strip. If they clicked yes, then a short cartoon strip from Calvin and Hobbes appeared. If the response was no, the respondent advanced to the next question. At question 35 (item 58), we asked the respondents to report their current mood while completing the survey.

First phase testing survey Y and O
We wanted to test the usability of the survey by examining which questions could be answered and whether the pupils could complete the survey within a set time (class duration). Teachers were present at the time of data collection; however, the survey was administered by the research team. The researchers handed out the survey, gave instructions encouraging full participation in the survey, and informed teachers and teacher assistants of their role during survey completion. The adults were asked to encourage the pupils to answer for themselves and to avoid giving the answer for the pupil.
After the completion of the survey, the teachers and teacher assistants provided feedback on their perceptions of how survey completion went, with particular reference to questions they felt were challenging for the pupils. The researchers were available to facilitate survey completion and recorded notes on items that pupils had difficulties responding to.
We used all parts of this information to inform the next development of the survey and prepare the administration guide.

Results from Study 1
The modified survey consisted of 71 questions (163 items). There were 12 pupils who tested the paper version and another 8 pupils who completed the online version. All pupils completed the survey within the duration of a class period (i.e. 45min). Of the 8 online responses, data from 5 pupils were stored in the database; the other 3 disappeared from the server. Among the recorded times, the quickest was 12min and the longest was 36min, with a median duration of 25min. None of the respondents who completed the online survey wanted to take a mental break from the survey to look at the comic. The respondents felt either happy or neutral at the halfway stage of the survey.
The researchers discussed the use of the VAS in the survey, in particular those presented in the functional difficulties items. With the help of a research assistant (see acknowledgements), it was agreed that the three-option temperature gauge and a cross for the last option (Figure 1) were acceptable. A cross for the last option was clearer than a temperature gauge going to the very end because, in theory, the maximum temperature can go beyond the temperature gauge, whereas a stop sign is the ultimate signal for a limit of functions. The general feedback on the other VAS used in the study was positive, although there was no certainty about the impact on responses. As such, in the next round of testing we organised some cognitive testing of the VAS. Moreover, we made sure that pupils would be familiar with the images by using a well-established special education teacher resource site called Papunet (https://papunet.net). The image bank has over 30,000 images that illustrate many daily activities, expressions and symbols for communicating with people with special educational needs. The images were licensed under CC-BY-NC-SA.

Study 2-Pilot study to carry out the M and L surveys
Modifications for online survey
Following the experiments in Study 1, we met to discuss the pilot phase of the survey development. The survey was transferred solely to an online platform called Webropol and further tests were based there. One of the main differences between paper and online surveys is the online design layout (35). We adjusted the paper version by blocking the questions so that students could see the entire question and its response options without having to adjust the display (36). In addition, the software had an automatic responsive styling feature (automatic adjustments for computers, tablets and phones). Some schools had laptops available and others had tablets, which worked best in landscape mode. In addition to the VAS, we also introduced graphics from Papunet to help describe the main action of a question, or a main icon to depict the characteristics of the item. For example, we used separate graphics referring to taking part in recess activities by sitting, walking, or playing sports.
We sent the surveys to the national centre for easy-to-read (ETR) language to convert and simplify the language. ETR recommended that matrix type of questions would be whole
Unfortunately, the teachers' survey could only be linked back to fewer than a third of the pupils. Teachers experienced problems with the survey platform. At the time of data collection, there was an excessive lag time; some records indicated that the survey took over 10 hours to complete, and others were missing. Despite contacting technical support, the server problems were so extensive that the technicians who host the survey platform were unable to fix them within the data collection period.
Although the pupils completed all the questions in the designated time, some modifications were made to improve the survey. Some of the battery questions were very detailed and included many items to cover a single question topic. For example, we would have liked the pupils to complete the sport enjoyment scale (37). However, it has 21 items, and we decided to remove it completely.
We also reduced the number of items in certain areas of interest. For example, in the versions used for studies 1 and 2, there were questions about what type of activities the individuals do during recess time. There were two contexts, indoors and outdoors, with four types of activities, 1) sitting or standing, 2) walking, 3) taking part in sports, play or ball games, 4) taking part in organised activities with a supervisor. In this survey section, there were 8 items to respond to in 2 questions. Based on the feedback, the two contexts of indoors and outdoors were too demanding on the pupils. As such, we decided to focus solely on recess time in general in a single question without reference to indoors or outdoors.

Study 3-Further modifications and the S version
Preparation for S version
Based on the feedback from study 2, it became apparent that a further adaptation of the survey was needed prior to national data collection. Pupils with more challenging behaviours and intellectual impairments had difficulty completing the survey, and therefore fewer items and simpler questions were needed.
Response items were further refined to fewer options. For example, all text was put into block capitals, and the vision item of the CFM was modified with a visual graphic of an eye to highlight the part of the body, with the question changed to "DO YOU SEE OK?" and dichotomous response options of "YES" and "NO". This version became the Short (S) version of the survey.
Testing phase for S survey
The purpose of study 3 was to test the usability of the S-version of the survey among pupils with challenging behaviours and intellectual impairments at the intensified and special support levels. The time it took to complete the survey was recorded through the online survey platform. In addition, we showed teachers a guide to be used to allocate the appropriate survey to their pupils, and we asked the teachers to give feedback about the guide. A convenience sample included three groups of pupils with intellectual impairments from the Central Finland area. Parents were informed of the study and gave permission for their child to complete the survey within school hours. Pupils completed the surveys anonymously and voluntarily.
Feedback from the teachers of these pupils was also noted by the researchers. Informal discussions also took place with teachers to gauge the appropriateness of the questions.
This was an essential step because the completers of the S-version of the survey would likely need the most support in completing the survey. All surveys were completed anonymously, but the involvement of teachers and teacher assistants in facilitating the completion of the survey may compromise the anonymous nature of the survey. This is a particularly sensitive aspect for items such as risk behaviours, bullying and other mental health related items.

Results from Study 3
A total of 11 pupils with intellectual impairments completed the S-survey. According to the three-tier special education system, they were considered under the special support pillar and therefore would complete the S-survey. The teachers gave instructions to the participants on how to enter the web address of the survey into a web browser and then asked the participants to complete the survey. The teachers and researchers were available to provide support for the participants during the completion of the survey, either by reading out the item to the individual or, if necessary, by clarifying the question and response categories. The median completion time for the S version of the survey (including the time for teacher or researcher support) was 21min, with a range between 11min and 41min and an interquartile range between 15min and 25min.
During data collection, the researchers noticed that teachers helped to explain the seven-day recall physical activity item by pointing to the daily activity calendar for the week and then asking the pupil to input their own response. A common practice in special education is to use a daily activity calendar on the board to inform the pupils of the schedule for the day and the days of the week. Even with this aid, the pupils found it difficult to differentiate between the categories of the response scale. The teachers reported that some of the participants were guessing rather than stating their own preference. Therefore, the researchers replaced the PA item with items covering three contexts: 1) being physically active during recess time, 2) out of school, and 3) on weekends. In addition, items regarding eating habits were split into two questions so that there were only two items per screen, rather than four. This led to a slight increase in the number of questions, although the number of items in the survey did not change between the pilot and the national data collection (study 4).
Teachers reported that the support given to the pupils was labour-intensive and suggested that audio recordings of the survey questions may be a way to overcome low levels of literacy.
Unfortunately, the survey platform did not support such an innovative adaptation, so it could be neither tested nor adopted. Other pupil feedback led to an increase in the size of the response buttons in the online survey. After these considerations were incorporated, the final version of the S-survey had 30 questions.
Teachers gave their feedback about the usefulness of the teacher's survey allocation guide. The teachers felt the information presented by medical diagnosis was not helpful.
They explained that it was the typology of support level that makes a difference in comprehension and they could not base the decision solely on diagnosis. Based on this feedback, we refined the teacher's guide to reflect the modern approaches used by teachers to identify the level of support the pupils required by illustrating the three-level support pillars.

Study 4 -National data collection
Sampling of the national data collection
There were two target groups, special education classes (SEC) and special education schools (SES). Schools were first recruited through the same sampling protocols used in the F-SPA (32) and the Finnish data collection of HBSC (38). The F-SPA and HBSC samples are nationally representative samples of school-aged children drawn through two-level cluster sampling with probability proportional to size. During recruitment for the F-SPA and HBSC studies, the school principals were asked to respond to the invitation with a choice to take part in either the F-SPA or HBSC study (both studies are coordinated at the University of Jyvaskyla). In their response to the invitation, the school principals were asked to inform the study coordinator if the school had SEC. Once the coordinator had received the acceptance and confirmation that the school had SEC, further instructions were sent to the school principals concerning the requirement for a randomly selected class and SEC from each year group.
The second target group were special schools. Initial contact was made with a list of schools (n = 82) comprising local SES, national SES (called 'Valteri' in Finnish) and hospital schools.
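The two-level sampling described for the school recruitment (schools drawn with probability proportional to size, then a class drawn at random within each selected school) can be sketched as follows. This is an illustrative sketch only: the school names, enrolment figures, class labels and function names are hypothetical, and systematic selection is just one common way to implement probability proportional to size; the text does not state which variant the F-SPA and HBSC protocols use.

```python
import random


def pps_systematic_sample(sizes, k, rng):
    """Level 1: pick k schools with probability proportional to size.

    sizes maps a (hypothetical) school name to its enrolment.
    A school larger than the sampling step can be picked more than once.
    """
    total = sum(sizes.values())
    step = total / k
    start = rng.random() * step  # random start in [0, step)
    targets = [start + i * step for i in range(k)]

    chosen = []
    cumulative = 0.0
    idx = 0
    for school, size in sizes.items():
        cumulative += size
        # Select the school whose cumulative size interval contains the target.
        while idx < k and targets[idx] < cumulative:
            chosen.append(school)
            idx += 1
    return chosen


def pick_class(classes, rng):
    """Level 2: pick one class at random within a selected school."""
    return rng.choice(classes)


if __name__ == "__main__":
    rng = random.Random(2018)  # fixed seed only to make the sketch repeatable
    enrolment = {"School A": 420, "School B": 150, "School C": 310, "School D": 90}
    classes = {name: ["7A", "7B", "7 SEC"] for name in enrolment}
    for school in pps_systematic_sample(enrolment, k=2, rng=rng):
        print(school, pick_class(classes[school], rng))
```

Larger schools cover a wider interval of the cumulative enrolment, so they are more likely to contain one of the evenly spaced selection points.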

Contents of the teacher survey items
Teachers were involved to give evidence of the feasibility of collecting PA and health data through self-report surveys among pupils in SECS. Teachers from the SECS completed an online survey with topics such as opinions about the teachers' guide for survey allocation, students' use of visual images, the difficulties the teachers thought the pupils had, and whether pupils followed up with the teachers to discuss any of the topics in the survey.

Feasibility analysis of study 4
Although data collected from study 4 were intended for national reporting, in this paper, we report only the feasibility data from the teachers and the time for survey completion by the pupils. The teachers' feedback is reported descriptively and no statistical analyses were performed. Because the contexts of SEC and SES differ, it was important to test differences in completion time. All versions of the surveys were completed online, and the platform stored the time the survey began and ended. For each survey, we retained the responses between the 2.5th and 97.5th percentiles of completion time, so that outliers were removed, and tested this central 95% of responses for differences in completion time between SEC and SES with Student's t-tests.
Average times and standard deviations were calculated, and Cohen's d effect sizes (39) were used to report the size of the differences between SEC and SES.
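The completion-time analysis described here can be sketched as follows. This is a minimal illustration, assuming completion times in minutes are held in plain Python lists; the function names (`trim_to_central_95`, `pooled_sd`, `cohens_d`, `student_t`) and the pooled-standard-deviation form of Cohen's d are assumptions made for illustration, not the authors' actual analysis code.

```python
import math
import statistics


def trim_to_central_95(times):
    """Keep values between the 2.5th and 97.5th percentiles (inclusive)."""
    # n=40 yields cut points every 2.5%; the first and last are the bounds.
    cuts = statistics.quantiles(times, n=40, method="inclusive")
    lower, upper = cuts[0], cuts[-1]
    return [t for t in times if lower <= t <= upper]


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    return math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))


def cohens_d(a, b):
    """Standardised mean difference between samples a and b."""
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd(a, b)


def student_t(a, b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    se = pooled_sd(a, b) * math.sqrt(1 / na + 1 / nb)
    return (statistics.mean(a) - statistics.mean(b)) / se, na + nb - 2


if __name__ == "__main__":
    # Hypothetical completion times (minutes) for one survey version.
    sec = trim_to_central_95([12, 14, 15, 17, 18, 19, 21, 22, 24, 60])
    ses = trim_to_central_95([10, 11, 13, 14, 15, 16, 18, 19, 20, 45])
    t, df = student_t(sec, ses)
    print(f"t({df}) = {t:.2f}, d = {cohens_d(sec, ses):.2f}")
```

The sketch stops at the t statistic and degrees of freedom; a p-value would come from the t distribution, which a statistics package would normally supply.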

Results from Study 4
Of the 889 respondents, the largest group completed the S-version (n = 396), followed by the M-version (n = 345), and the fewest completed the L-version (n = 148) (Table 1). A total of 27 SEC and 31 SES were involved in the study. Over two thirds of the respondents were males (69.0%).
Table 2.
The remaining teachers (32%) stated that allocation was easy for some pupils and difficult for other pupils. None reported the information was insufficient. A few teachers (n = 7) reported that none of the three surveys was suitable for some of the pupils.
Another teacher reported difficulties in allocating the correct survey because a pupil could fall anywhere on the spectrum between L, M, and S.
The majority of the teachers (53%, n = 21) reported that pupils generally responded well to the surveys, and 40% of teachers (n = 16) felt the pupils were neutral to the survey. Only one teacher reported that the pupils did not easily understand the content of the questions; over half (53%, n = 21) reported that the pupils partially understood, and the remainder (45%, n = 18) stated 'yes', the pupils easily understood the content. Almost all the teachers (95%) felt that the visual images in the survey made it easier for the pupils to understand the items. The remaining 5% could not comment on the visual images, rather than reporting that the images were unhelpful. Few teachers (13%, n = 5) reported that pupils followed up with them by discussing certain PA and health behaviours appearing in the survey, for example bullying.

Discussion
The culmination of four studies to provide national PA and health data among adolescents in the SECS is the first step of the inclusive data disaggregation strategies in Finland. We started by identifying which areas of interest we could use through combining other national surveys such as the F-SPA (for PA behaviours) and HBSC (for health and health behaviours) so that comparisons could be made between the school contexts for pupils with and without special education support needs. However, prior to carrying out such analyses, preceding steps were identified and reported in this paper through pilot and feasibility studies.
The reduction in the number of items from the F-SPA and HBSC studies led to the creation of five different surveys that could be aligned with the three-tier support system in special education (12). Modifications to the text in the questions, the range of response options and the use of VAS were the key adaptation principles we used after following the guidelines from Coolican (31). Face validity is an important part of survey development when creating young adolescent self-report surveys (40), and we made sure that each phase was tested appropriately with pupils in SECS. This included the use of the National Easy-to-Read Language service, whose experience in simplifying language for all people to comprehend was used to check the items and to keep the internal validity of the previously tested items. As a result of these procedures, the fundamental rights of the pupils in SECS on data disaggregation are one step closer to reality. It has been reported that it is preferable to use, as much as possible, existing instruments from other monitoring studies (41), such as F-SPA and HBSC. In our example, we have reported the appropriateness and feasibility of the adapted survey through rigorous scientific methods.
Most of the pupils were able to complete a long survey within a 45-minute class, and the time varied with individual abilities and the survey version. The only time difference we found between educational settings was for completers of the L-O version, whereby more time was spent by pupils in the SEC. The anonymous nature of the data analysis made it difficult to investigate the reasons for these differences. However, this might be explained by the differences between co-teachers' roles in the SEC (more helping child) and SES (more helping teacher) that have been noted in earlier studies (42).
Moreover, teachers in SES have more experience of co-teaching than those in the SEC (43). The changes in the education system for the SECS and the response to interventions may be more intense at SES than at the SEC level (44). As such, it would be logical that pupils in the SEC require more time to answer questions than those in SES. However, further studies and different data collection methods are needed to find out how pupils understood the items and scales. Methodological considerations are particularly crucial when dealing with children with special needs to avoid the risk of exclusion from research (45,46). Both the UN's CRPD and the Convention on the Rights of the Child demand that children and youth with disabilities be acknowledged as equal to everybody else in services provided by society (26). Moreover, the goal of the Non-discrimination Act of 2015 in Finland was to prevent exclusion and expand the obligation to promote equality, including among the education providers and scientific institutes responsible for research.
The health and overall well-being of Finnish young adolescents have developed positively according to many indicators during the last decades (47,48). Good examples of these developments include decreased smoking and alcohol use, and increased physical activity (48,49). Moreover, the increase in obesity seems to have stabilized (47).
Therefore, a survey adaptation process such as this was necessary, not just for the sake of advancing the knowledge base, but for the equality of young adolescents.

Study Limitations
There are some limitations to the study that should be noted and that could be areas to consider for future implementation of the study. Some population bias may exist throughout the testing, as convenience samples were used for the first three pilot studies. Moreover, some pupils were unable to complete the S-version of the survey as it was still too difficult for them. The context was the Finnish education system, and the processes may be limited to Finland. In the fourth study, teachers in the schools administered the surveys, and fidelity information on how well teachers and assistants followed the instructions as intended was lacking. Despite our efforts to make the survey as universally available as possible, not all survey instruments were completely accessible. For example, the survey was not converted to a platform whereby students with severe visual impairments could complete it unaided, as we believe this would have required a separate protocol for a teacher-assisted survey. The study was also limited to the handful of students who gave their opinions on the appropriateness of the VAS, and more testing on the effect of VAS on survey responses is needed.

Conclusions
The inclusion of young adolescents with disabilities in large-scale surveys of the whole age group, such as the F-SPA and HBSC studies, must be guaranteed in the future.