Participants
Thirty-nine girls with RTT, ranging in age from 3 to 24 years (mean = 9.8 years), were recruited by the Italian Rett Syndrome Association (AIRETT). Demographic, developmental, clinical, behavioural, and genetic information was collected from all available sources, such as parent/caregiver reports of past history, current behaviour and features, and the latest clinical reports. Table 1 summarizes the characteristics of all participants.
Technological Architecture
The Interactive School communication architecture is built on the Cisco Webex conferencing system. We chose this system because, when the Covid-19 emergency started, we were already using the platform for a telerehabilitation project [11]. This coincidence allowed us to start the Interactive School immediately, since educators and therapists had already acquired the skills needed to manage a videoconference with RTT children, though for a different purpose. During a telerehabilitation session, the therapist supervises activities that the patient carries out locally with the help of a caregiver. By contrast, during an Interactive School lesson, the educator administers several activities by means of multimedia material (such as presentations or videos) or asks for physical movements (e.g. playing a tambourine). The RTT children then respond to these stimuli, helped by the caregiver, and their interactions and levels of attention can be observed by means of an eye-tracker.
Figure 1 shows the overall architecture.
As already mentioned, during a lesson the educator administers multimedia material through the videoconferencing software. The material varies, and each type requires a different level of interaction from the RTT children. Figure 2 illustrates what the educator shows for each lesson activity. While the educator shares a video (e.g. a cartoon), each of the connected RTT children sees the same content (Figure 2 - A). On the child’s side, the interaction is acquired by the eye-tracker, and the resulting data are subsequently used for attention analysis. During slide sharing, the educator shows some content and asks the children to answer a question by choosing an image on the screen (Figure 2 - B). In this case, the caregiver in turn shares the screen with the educator, so that the educator can see the child's choice by looking at the on-screen cursor, which is moved through the eye-tracker.
Finally, when interaction between the children is needed, the educator switches to grid view, so that each participant can see the others. This occurs, for instance, when the educator asks the children to greet one another (Figure 2 - C).
Procedure
The Interactive School was composed of social and cognitive interaction intervals. For the social interaction intervals, at the end of the opening multimedia presentation outlined above, the teacher expanded the video of each participant in turn and invited them to introduce themselves according to their skills. For the cognitive interactions, after the second social interaction, the video of a cartoon story was presented. The cartoon changed in each section; it was extracted from famous animation movies, such as “Heide”, “Minnie”, and “Mary Poppins”, and calibrated according to the comprehensibility of the story. Each cartoon lasted 2 min 30 s. At the end of the cartoon sequences, each participant in turn performed an immediate recall of the cartoon through a recognition test composed of 10 questions about the story. For each question, two pictures were presented on the screen: the correct answer and a distractor. Each correct answer scored 1 point and each distractor choice scored 0 points. The total time of each session was about 20 minutes. Before the experiment started, attention and stereotypies were measured in a baseline condition without tasks or social engagement, in which participants accessed the platform and waited two minutes for the experimental tasks to start.
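The scoring rule of the recognition test described above (1 point per correct picture, 0 per distractor, 10 questions) can be sketched as follows. This is a minimal illustration, not the software used in the study: the function name `score_recognition_test` and the `"left"`/`"right"` labels for the two pictures are hypothetical, and the example answers are invented for demonstration only.

```python
def score_recognition_test(chosen, correct):
    """Return the number of correct replies (CR).

    Each question scores 1 point if the chosen picture is the correct
    one and 0 points if it is the distractor.
    """
    if len(chosen) != len(correct):
        raise ValueError("chosen and correct must have the same length")
    return sum(1 for c, k in zip(chosen, correct) if c == k)


# Hypothetical session: 10 questions, each answered by picking the
# 'left' or 'right' picture (labels are an assumption of this sketch).
correct = ["left", "right", "left", "left", "right",
           "right", "left", "right", "left", "right"]
chosen = ["left", "right", "right", "left", "right",
          "left", "left", "right", "left", "left"]

score = score_recognition_test(chosen, correct)
print(score)  # 7 correct replies out of 10
```

Because every question has exactly two alternatives, the total score ranges from 0 to 10.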
Data analysis
Eye-tracking. Within each stimulus, a square area of interest (AOI) was defined around the target. The AOI covered about 19 degrees of visual angle.
For each AOI, relative to each stimulus, the fixation length (FL) was measured, that is, the amount of time (in seconds) the participant spent looking at the target. Fixations were extracted using a threshold of 100 ms.
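One possible way to compute FL from raw gaze samples is sketched below. The text does not state which fixation-detection algorithm was used, so this sketch makes a simplifying assumption: a fixation is a run of consecutive gaze samples inside the square AOI lasting at least 100 ms. The names `SquareAOI` and `fixation_length`, the pixel coordinates, and the 50 ms sample spacing are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SquareAOI:
    """Square area of interest, centred on (cx, cy), in pixels."""
    cx: float
    cy: float
    side: float

    def contains(self, x, y):
        half = self.side / 2
        return abs(x - self.cx) <= half and abs(y - self.cy) <= half


def fixation_length(samples, aoi, min_duration_ms=100):
    """Sum the durations of in-AOI runs lasting >= min_duration_ms.

    samples: iterable of (t_ms, x, y), sorted by timestamp t_ms.
    Returns the fixation length in seconds.
    """
    total_ms = 0
    run_start = run_end = None
    for t_ms, x, y in samples:
        if aoi.contains(x, y):
            if run_start is None:
                run_start = t_ms
            run_end = t_ms
        else:
            # The run ended: keep it only if it reaches the threshold.
            if run_start is not None and run_end - run_start >= min_duration_ms:
                total_ms += run_end - run_start
            run_start = None
    # Close a run that is still open at the end of the recording.
    if run_start is not None and run_end - run_start >= min_duration_ms:
        total_ms += run_end - run_start
    return total_ms / 1000


aoi = SquareAOI(cx=500, cy=500, side=200)
inside, outside = (500, 500), (900, 100)
# Hypothetical gaze trace sampled every 50 ms.
trace = [
    (0, *inside), (50, *inside), (100, *inside), (150, *inside),  # 150 ms: kept
    (200, *outside),
    (250, *inside), (300, *inside),                               # 50 ms: dropped
    (350, *outside),
    (400, *inside), (450, *inside), (500, *inside),
    (550, *inside), (600, *inside),                               # 200 ms: kept
]
fl_seconds = fixation_length(trace, aoi)
print(fl_seconds)  # 0.35
```

Timestamps are kept as integer milliseconds so that the comparison with the 100 ms threshold is exact.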
Measures
Behavioural and cognitive measures. With reference to the general behaviour of the patients with RTT, the parameters were:
- Number of seconds of attention (fixation length) to social and cognitive tasks, for a maximum time of 10 minutes (600 s);
- Time (seconds) spent in stereotypies in the absence of stimulation and during social and cognitive tasks, for a maximum time of 10 minutes (600 s).
Social communication task. With reference to social communication, the parameters were:
- FL of the participant on the teacher during the assignment of tasks, during reinforcements, or while singing a song (Figure 3);
- FL of the same participant on the first girl who was invited to move or reply when she was called by the teacher;
- FL of the same participant on the second girl who was invited to move or reply when she was called by the teacher;
- FL of the same participant on the third girl who was invited to move or reply when she was called by the teacher;
- FL of the same participant on the fourth girl who was invited to move or reply when she was called by the teacher.
Cognitive tasks. With reference to cognitive tasks, the parameters were:
- FL of the participant on the main character of the cartoon;
- FL of the participant on the PC screen but not on the main character;
- FL of the participant outside of the PC screen;
- Number of correct replies to the questions on the cartoon. The recognition test was based on 10 questions regarding the story posed immediately after seeing the cartoon.
In more detail, for the cognitive task, FL was computed as illustrated in Figure 4. In both tasks, FL refers to the amount of time (in seconds) the participant spent looking at the correct stimulus. Fixations were extracted using a threshold of 100 ms.
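The three gaze regions listed for the cognitive task (main character, rest of the PC screen, outside the screen) can be illustrated with the sketch below. It is only an assumption about how such a partition might be coded, since the actual procedure is the one shown in Figure 4: the rectangle coordinates, the fixed 50 ms sampling interval, and the names `classify_gaze` and `time_per_region` are hypothetical, and the 100 ms fixation threshold is deliberately left out to keep the example short.

```python
from collections import Counter


def classify_gaze(x, y, screen, character):
    """Label one gaze point as 'character', 'screen' or 'outside'.

    Rectangles are (left, top, right, bottom) in pixels; the character
    rectangle is assumed to lie inside the screen rectangle.
    """
    def inside(rect):
        left, top, right, bottom = rect
        return left <= x <= right and top <= y <= bottom

    if inside(character):
        return "character"
    if inside(screen):
        return "screen"
    return "outside"


def time_per_region(points, screen, character, sample_interval_ms):
    """Return seconds of gaze per region, assuming evenly spaced samples."""
    counts = Counter(classify_gaze(x, y, screen, character) for x, y in points)
    return {region: counts[region] * sample_interval_ms / 1000
            for region in ("character", "screen", "outside")}


screen = (0, 0, 1920, 1080)           # hypothetical display size
character = (800, 400, 1100, 700)     # hypothetical AOI around the character
points = [(900, 500), (950, 550), (100, 100), (2000, 500), (1000, 600)]

times = time_per_region(points, screen, character, sample_interval_ms=50)
print(times)  # {'character': 0.15, 'screen': 0.05, 'outside': 0.05}
```

Checking the character rectangle first makes the three categories mutually exclusive, which matches the way the measures are defined.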
Statistical analysis
Data were analysed using SPSS version 24.0 for Mac. Descriptive statistics of the dependent variables were tabulated and examined. The alpha level was set to 0.05 for all statistical tests. For significant effects, the effect size of the test was reported. To verify the effects of the variables considered in this study, a repeated-measures ANOVA was carried out and Fisher's test was used. The relationship between variables was first evaluated by computing Pearson’s r. Linear regression analysis was then performed to evaluate the relationship between FL and correct replies (CR). The correlation coefficient β was used for the linear regression analysis. The following guidelines proposed by Chan [26] were used to assess the strength of the linear relationship: poor (β < 0.3), fair (β 0.3–0.5), moderately strong (β 0.6–0.8), and very strong (β ≥ 0.8).
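The correlation step and the strength guidelines above can be sketched in plain Python as follows. The analysis itself was run in SPSS, so this is only an illustration: the FL and CR values are invented, the function names are hypothetical, and two choices are assumptions of this sketch rather than statements from the text, namely taking the absolute value of the coefficient and assigning the boundary value 0.8 to "very strong". For a regression with a single predictor, the standardized β equals Pearson's r, which is why one function serves both steps here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def strength(beta):
    """Map a coefficient to the strength labels quoted in the text."""
    b = abs(beta)  # assumption: the sign is ignored when grading strength
    if b < 0.3:
        return "poor"
    if b <= 0.5:
        return "fair"
    if b >= 0.8:
        return "very strong"
    if b >= 0.6:
        return "moderately strong"
    return "unclassified"  # 0.5 < b < 0.6 is not covered by the stated ranges


# Hypothetical data: fixation length (s) and correct replies (0-10).
fl = [12.0, 30.0, 45.0, 60.0, 90.0]
cr = [3, 4, 6, 7, 9]

r = pearson_r(fl, cr)
label = strength(r)
print(round(r, 3), label)
```

The "unclassified" branch makes explicit that the quoted ranges leave values between 0.5 and 0.6 without a label.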