Atypical and Non-spontaneous Attentional Control in “Just Look” Tasks for Face Recognition in ASD Children Revealed by Gaze Tracking Pupillometry

We hypothesized that abnormalities in social interaction and executive function may be related to fluctuations in pupil diameter, which reflect norepinephrine activity in terms of attentional function. We adopted “just look” tasks to examine spontaneous changes in attention. Twenty children with autism spectrum disorder (ASD) and 39 typically developing (TD) controls participated. Intergroup comparisons were performed of pupil diameter changes during a shift from a scrambled image to the original image (task 1-a), fixation on faces, letters, and geometric patterns (task 1-b), and pupil diameter changes during a shift from a nonsense image to a face-like image (task 2). In task 1-a, ASD children had prolonged pupil dilation after the shift in images, whereas the pupil contracted in TD children, indicating deficits in attentional disengagement in ASD children. In task 1-b, ASD children preferred geometric patterns over faces. In task 2, the rate of pupillary dilatation during the shift in images was lower in ASD children than in TD children. Therefore, ASD children appear to have abnormalities in spontaneous attention to faces, which functions automatically in TD children. In conclusion, atypical attentional function may contribute to the manifestation of abnormalities in social interaction and executive control in ASD.


Introduction
What we direct our attention to, automatically or intentionally, influences brain and behavioral development. Atypical attentional function has been suggested as one of the earliest characteristics of infants who are later diagnosed with autism spectrum disorder (ASD) [1]. Therefore, a more detailed understanding of the development of attentional mechanisms in children with ASD may clarify how their attentional abnormalities affect the formation of the core symptoms of ASD.
Many researchers have used eye trackers, which capture eye gaze and measure pupil diameter, to examine attentional function. Pupil diameter not only changes in response to a light stimulus but is also influenced by cognitive and autonomic activities because pupil dilation is modulated by norepinephrine (NE) released from the locus coeruleus (LC). The LC, a cluster of neurons located in the rostral pons, is the sole source of NE and sends projections to widespread brain regions, with particularly dense projections to areas known to be important for attentional processing, such as the parietal cortex, the pulvinar nucleus of the thalamus, and the superior colliculus [2]. The LC-NE system plays an important role in regulating cortical arousal levels [3], the focusing of attention, executive functions, and memory consolidation [4]. Animal and human studies demonstrated that pupil diameter was directly related to moment-to-moment fluctuations in the activity of norepinephrinergic neurons in the LC [5]. A recent study confirmed this using simultaneous recordings of pupil diameters and neuronal firing of the LC in rhesus monkeys. The findings obtained showed that LC spiking activity and pupil diameter changed with the effort exerted by monkeys [6]. In humans, combined pupillometry-fMRI studies revealed temporal coupling between LC BOLD signals and pupil dilation [2,7]. Combined pupillometry-fMRI allows for assessments of moment-to-moment fluctuations in physiological activity, subjective task difficulty, mental effort, and attention based on measurements of pupil diameters. Many neuroimaging and event-related potential studies have been reported for ASD in adults and adolescents. Although these methods revealed differences from typically developed subjects in complex tasks, they are difficult to apply to pediatric ASD and their findings do not always explain the pathology in childhood.
Based on the hypothesis that there is a pathology of spontaneity in ASD [8] (particularly regarding attention), we herein examined pupil and gaze changes during tasks of "just looking" performed without any instruction.
The aim of the present study was to examine two major symptoms of ASD, dysfunctional social interaction and executive control, from the viewpoint of attentional function, using an eye tracker during two separate face processing tasks.

Results
Task 1-a; Intergroup comparison of the percent change in pupil diameters with a shift from a scrambled image to the original image
Figure 1 shows "raw" representative pupil diameter data from one child each in the ASD group and TD group during task 1-a. During the shift from the scrambled image to the original image, the percent change in pupil diameters was significantly lower in the ASD group than in the TD group (Faces/Letters: TD 7.43 ± 5.04% vs ASD 1.73 ± 7.30%, p=0.003; Faces/Geometric patterns: TD 11.23 ± 6.97% vs ASD 7.07 ± 6.36%, p=0.026). In the TD group, the pupil dilated while looking at the scrambled image and contracted during the original image phase. On the other hand, in the ASD group, the pupil dilated during the scrambled image phase and pupil diameters remained almost unchanged after the shift to the original image (Table 1).
Task 1-b; Differences in visual preferences for faces or letters/geometric patterns in children between groups
Heatmaps revealed the density of fixations by children in each group (Figure 2). The duration of fixation on faces was significantly shorter in the ASD group than in the TD group when they were presented with still images consisting of children's faces and geometric patterns (image D) (TD 1875 ms vs ASD 1126 ms, p=0.007). The ratio of face gazing duration to total gazing duration was also smaller in the ASD group than in the TD group (TD 52.96% vs ASD 37.61%, p=0.047). On the other hand, when the groups were presented with still images consisting of children's faces and letters (image C), no significant differences were observed between the 2 groups in the duration of fixation on faces or the percentage of face gazing duration to total gazing duration (duration of fixation on faces: TD 2693 ms vs ASD 2101 ms, p=0.122; percentage of face gazing duration: TD 75.83% vs ASD 70.65%, p=0.594) (Table 2).
Task 2; Intergroup comparison of the percent change in pupil diameters with a shift from a nonsense image to a face-like image
Figure 3 shows "raw" representative pupil diameter data from one child each in the two groups during task 2. During the shift from the nonsense image to the face-like image, the percent change in pupil diameters was significantly smaller in the ASD group than in the TD group (TD -7.36 ± 6.14% vs ASD -4.19 ± 6.85%, p=0.047) (Table 3).

Discussion
Task 1-a; Hyperarousal and deficits in attentional disengagement in the ASD group
This is the first study to report impairments in visual disengagement and spontaneous attentional modulation, as revealed by changes in pupil diameter during a task, in children with ASD. Since pupil dilation is a measure of mental effort and task-directed attention due to LC-NE activity, pupils in the TD group dilated while looking at the scrambled image, which required mental effort to perceive, and then shrank after the shift to the original image, which did not need as much attention. On the other hand, children with ASD had exaggerated pupil dilation after the shift to the original image that was easy to perceive. This result implied that ASD children had exaggerated LC activity and focused attention during this task as well as deficits in visual disengagement, which is consistent with previous findings showing impaired attentional disengagement in ASD individuals based on eye movements [1]. Impairments in disengagement and a shift in attention to both social and non-social auditory and visual stimuli have been reported in infants and children with ASD [9,10]. Previous studies demonstrated that children diagnosed with ASD had prolonged visual fixation and impaired visual disengagement from the early stage of infancy [10,11].
LC neuron activity is divided into tonic and phasic components, and the LC-NE system plays a role in the regulation of behavior, not only in sensory processing or the regulation of arousal. Phasic LC activation is driven by the outcome of task-related decision processes and helps to optimize task performance, whereas tonic activation is associated with disengagement from the current task and facilitating behaviors for the exploration of alternative sources of reward [12]. Direct neuronal recordings in monkeys and neuroimaging studies in humans revealed prominent descending projections to the LC from the orbitofrontal cortices and anterior cingulate that played critical roles in evaluating rewards and costs, respectively. These frontal areas receive inputs from a wide array of sensory-motor areas and assess task-related utility. These areas drive transitions between the phasic and tonic modes, driving the LC toward the phasic mode for task-associated rewards and toward the tonic mode for tasks with diminishing utility [12]. Task 1-a in the present study required participants to "just look at the screen"; therefore, it mainly evaluated the tonic activity of the LC-NE system. Exaggerated and persistent pupil dilation in ASD children may reflect the dysfunctional tonic activity of this system. Early signs of inflexibility appear to be consistent with executive dysfunction and repetitive and stereotyped behaviors in ASD [1,13].
Task 1-b; Preference for geometric patterns and deficits in face-selective memory, eye-specific discrimination, and facial emotion recognition in the ASD group
The ASD group spent less time attending to faces and more time looking at geometric patterns than the TD group, which is consistent with previous findings [14,15]. It has been suggested that individuals with ASD attempt to avoid eye contact, which is aversive and socially threatening to them [1,16].
Another hypothesis is that deficits in face processing by individuals with ASD may contribute to their avoidance of the face. A review of face recognition reported that adults and children with ASD exhibited marked impairments in face-selective memory and eye-specific discrimination; however, in most studies, markers of typical face processing, such as the face inversion effect and the part-whole effect, appeared to be intact [17,18]. In the majority of studies on face memory, participants with ASD and neurotypical participants viewed several faces and objects and then performed a memory test, such as the old-new recognition test [17]. The findings obtained revealed immediate or long-term memory deficits selective for faces in participants with ASD. Regarding discrimination, only slight differences were observed between participants with ASD and neurotypical participants who performed fine-grained face perception tasks requiring the discrimination between two or more faces [17]. Therefore, individuals with ASD appeared to have specific deficits discriminating the eyes, even in tasks without a memory demand.
In addition to these deficits, ASD individuals were found to have an impaired ability to perceive and interpret others' facial expressions and the intentions conveyed by observed gaze shifts [19,20]. Many functional MRI studies have been conducted on facial emotion processing, and the findings obtained suggested hypoactivity of the core brain regions considered to be related to face processing in individuals with ASD, such as the fusiform area [21,22], amygdala [19,22], and posterior superior temporal sulcus [22,23], when identifying or viewing emotional expressions. Brain imaging studies showed that individuals with ASD used compensatory mechanisms to recognize facial emotions [24][25][26]. These findings suggested that facial emotion recognition by ASD individuals was more effortful and cognitive-based, that is, non-spontaneous, than that by neurotypical individuals, whose recognition was more automatic [19].
Therefore, children with ASD appear to avoid paying attention to faces, particularly the eyes, which are harder to perceive than geometric patterns. Since faces, particularly the eyes, are crucial social stimuli, and face processing is a window of social interaction, these deficits in individuals with ASD may result in a loss of information that is important for the development of appropriate social functioning.
On the other hand, one of the reasons for the lack of significant differences with image C may be that the characters, including kanji, were too difficult for children of this age to recognize. This may be inferred from the duration of fixation on letters being shorter than that on geometric patterns, even among TD children.
Task 2; Abnormalities in spontaneous arousal and attention to face-like objects that were perceived similarly to real faces in the ASD group
A comparison of the percent change in pupil diameters with a shift from the nonsense image to the face-like image between the TD and ASD groups showed that pupil dilation was greater in the TD group than in the ASD group. This result indicated that the ASD group increased their arousal and attention towards face-like figures to a lesser extent than the TD group. As described above, ASD children exhibit a stronger preference for geometric patterns than TD children. Electrophysiological and functional neuroimaging studies showed that humans perceive a face in non-face stimuli based on a global face-like configuration. In these studies, the stimuli of face-like objects activated the fusiform face area, similar to faces [27][28][29]. A previous study also demonstrated that children with ASD holistically perceived a face-like object as a face [30]. In that study, preschoolers with ASD were presented with an upright face-like object and its inverted version side by side to investigate orientation and maintenance towards the upright object. ASD and TD children both spent more time looking at the upright object, which indicated that they perceived it as a face. However, ASD children were more likely to direct their first fixation towards the inverted object than TD children, suggesting a deficit in orientation by ASD children towards the face. In task 2 in the present study, we demonstrated, based on pupil diameter measurements, abnormalities in spontaneous arousal and attention by ASD children to face-like objects, which were perceived similarly to real faces. Similar to task 1-b, the reason why this attention was not evoked as much in the ASD group as in the TD group may have been that faces are hard to perceive and are socially threatening for ASD individuals.
Based on these results, we suggest that children with ASD have deficits in spontaneous arousal towards faces, which functions automatically and unconsciously in TD children.
The present study has several potential limitations that need to be addressed. Since the sample size was small, further studies are needed. Moreover, the ASD and TD groups were not matched for IQ or severity of symptoms. However, ASD children with very low IQ or severe symptoms did not participate in the present study because ASD children who were not successfully calibrated were excluded. This may cause a selection bias. Therefore, the results of the study were obtained from a limited population intelligent enough to perform the task. Despite these limitations, we consider it important to clarify the issue of spontaneity in the control of attention as the neurobiological basis in childhood ASD with the task of "just looking".
In conclusion, the present results provide evidence for atypical attentional control in ASD individuals, which may contribute to the manifestation of the core symptoms of ASD, such as abnormalities in social interaction and executive control. The eye tracker is a simple, non-invasive, and real-time tool for clarifying neural mechanisms, even in pediatric ASD.

Methods
Participants
TD controls were recruited from a prospective birth cohort study (the Seiiku Boshi Cohort), which included mothers and babies who received medical care for pregnancy and delivery at the National Center for Child Health and Development between December 2010 and April 2013. Subjects with ASD were enrolled from patients who came to our hospital for diagnosis and treatment and from the previously described cohort. The final diagnosis for participants with ASD was confirmed with DSM-5 by a board-certified pediatric neurologist with more than 20 years' combined experience in ASD. Fifty-nine children aged 5 to 11 years old, comprising 20 with ASD (ASD group; 14 males, 6 females; mean age ± SD 7.0 ± 1.3 years; age range 5.9-11.0 years) and 39 TD controls (TD group; 19 males, 20 females; mean age ± SD 6.7 ± 0.7 years; age range 5.9-7.8 years), participated. The Mann-Whitney U test was used for comparisons between groups of sex and age, and there were no significant differences (sex p=1.99, age p=0.70). None of the participants were taking medication that may affect the NE system, for example, medicine for attention-deficit hyperactivity disorder. Informed consent was obtained from the parents or guardians of all participants. The present study was performed in accordance with the Ethical Guidelines for Medical and Health Research Involving Human Subjects, and was approved by the ethics committee/institutional review board of the National Center for Child Health and Development (number 2020-319).

Apparatus and pupil data acquisition
A Tobii X2-30 eye tracker (Tobii, Danderyd, Sweden, www.tobii.com) was used to measure children's fixations and pupil diameters at a sampling frequency of 30 Hz in response to a visual stimulus. Participants were seated in front of the eye tracker at a distance of 40-60 cm and a visual angle of 5.7 degrees from the monitor. The lights were on during tasks, and illumination of the room was kept constant for all participants. The eye tracker was calibrated before trials using a five-point calibration screen. The procedure was repeated until the participant had successful calibration ascertained by an automated validation procedure. Participants were excluded if they were not successfully calibrated.

Tasks
The images shown below were all presented on the monitor. The task for participants was to just look at the images and no other instructions were given, which eliminated extraneous attention and allowed an assessment of performance based on spontaneity.
Task 1-a; Children were presented with scrambled still images consisting of children's faces and letters (image A) or those of children's faces and geometric patterns (image B) for 5 seconds, followed by the original images (image C or D) (Figure 4). Informed consent was obtained from the parents or guardians of all children in these images for publishing their images in an online open-access journal. We presented images in the following order: image A, C, B, and D without a pause between images. To evaluate each group's response to images that are difficult and easy to recognize, the percent change in pupil diameters between stimuli was calculated. We adopted the mean pupil diameter measured during the last 2 of the 5 seconds of gazing at each image to minimize the effects of a change in the intensity of light. Regarding the pupil light response, the time interval between a light stimulus onset and when the pupil reaches its minimal diameter in the TD and ASD groups was previously reported to be within 1.0 second [31,32]. The total time taken by the pupil to recover to 50% of the maximal pupil diameter from peak constriction was shown to be within 1.0 second [31], while that of healthy adults to recover 75% of the maximal pupil diameter was approximately 1.7 seconds [33,34]. Regarding dark adaptation, pupil dilatation was previously shown to plateau 3 seconds after the lights had been turned off [35].
Task 1-b; In addition to the above analysis of pupil diameter changes, data on fixation duration while showing images C and D were calculated to identify which stimulus each child preferred. These data were compared between groups.
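The fixation measures compared in task 1-b (duration of fixation on faces and its percentage of total gazing duration) can be illustrated with a minimal sketch. The source does not describe the export format of the Tobii software, so the input layout here, a list of (area-of-interest label, duration in ms) pairs, and the function name `face_gaze_summary` are assumptions made purely for illustration.

```python
def face_gaze_summary(fixations, face_label="face"):
    """Summarize fixations for one child on one image.

    `fixations` is a hypothetical list of (aoi_label, duration_ms) pairs;
    this layout is an assumption, not the Tobii export format.
    Returns (face_duration_ms, face_percentage_of_total_gazing).
    """
    total_ms = sum(duration for _, duration in fixations)
    if total_ms <= 0:
        raise ValueError("no gazing time recorded for this image")
    face_ms = sum(duration for label, duration in fixations if label == face_label)
    return face_ms, 100.0 * face_ms / total_ms


# Illustrative values only (not data from the study).
example = [("face", 1200), ("geometric", 600), ("face", 300), ("geometric", 900)]
face_ms, face_pct = face_gaze_summary(example)
print(face_ms, face_pct)
```

Computing the percentage per child before comparing groups keeps the measure independent of how long each child looked at the screen overall.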
Task 2; Children were presented with a nonsense figure consisting of 6 red dots (image E, Figure 5) for 5 seconds. A processed face-like figure consisting of three dots and its upside-down image were then presented side by side (image F, Figure 5). To assess the difference in each group's response to the nonsense and face-like figures, the percent change in pupil diameters between stimuli was calculated. Similar to task 1-a, we adopted the mean pupil diameter measured during the last 2 of the 5 seconds of gazing at each image.
All participants completed these tasks on the same day. Spatiotemporal eye fixation data and pupil diameters were recorded using Tobii software (with a 35-pixel radius filter).
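The percent change in pupil diameter used in tasks 1-a and 2 can be sketched as follows. The 30 Hz sampling rate, the 5-second presentation, and the last-2-seconds averaging window come from the text. The exact formula is not stated in the source, so the sign convention below (positive values mean contraction after the shift) is an assumption, chosen because it appears to match the signs reported in Results. Representing missing samples as `None` is also an assumption.

```python
from statistics import mean

SAMPLING_HZ = 30    # Tobii X2-30 sampling frequency (from the text)
WINDOW_SECONDS = 2  # last 2 of the 5 seconds of each image (from the text)


def window_mean(samples, hz=SAMPLING_HZ, window_s=WINDOW_SECONDS):
    """Mean pupil diameter over the final `window_s` seconds of one image.

    Missing samples (e.g. blinks) are assumed to be stored as None and skipped.
    """
    n = int(hz * window_s)
    tail = [s for s in samples[-n:] if s is not None]
    if not tail:
        raise ValueError("no valid pupil samples in the averaging window")
    return mean(tail)


def percent_change(first_image, second_image):
    """Percent change between two consecutive images.

    Assumed convention: (before - after) / before * 100, so contraction
    after the shift is positive and dilation is negative.
    """
    before = window_mean(first_image)
    after = window_mean(second_image)
    return (before - after) / before * 100.0


# Illustrative values only: 150 samples = 5 s at 30 Hz for each image.
scrambled = [4.0] * 150
original = [3.6] * 150
print(round(percent_change(scrambled, original), 2))
```

Averaging only the final 2 seconds skips the initial pupil light response, which the text reports as largely complete within about 1 to 1.7 seconds.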

Statistical analysis
Participants without successful calibration were not included in analyses. Regarding the pupil data of each participant, the pupil with less noise from bilateral eye data was selected by visual inspection. Spatiotemporal eye fixation data and pupil diameters were analyzed by built-in Tobii software (with a 35-pixel radius filter). The Mann-Whitney U test was used for comparisons between groups of the duration of fixation on faces, letters, and geometric patterns in task 1-b, and the percent change in pupil diameters in task 1-a and task 2. A p-value <0.05 was considered to indicate a significant difference.
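For readers unfamiliar with the test, a minimal pure-Python sketch of a two-sided Mann-Whitney U test is given below. It uses the normal approximation with a tie correction and no continuity correction, which is an assumption; the source does not say which software or variant was used, and a statistics library such as SciPy (`scipy.stats.mannwhitneyu`) would normally be preferred in practice. The function name and sample values are illustrative only.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U, p), where U is the smaller of the two U statistics.
    Ties receive average ranks and the variance is tie-corrected.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one observation")
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this group of tied values
        avg_rank = (i + 1 + j) / 2.0   # average of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    mu = n1 * n2 / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return u, 1.0
    z = (u - mu) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the normal tail
    return u, p


# Illustrative values only (not data from the study).
td_group = [7.1, 9.4, 5.2, 12.0, 8.3, 6.6]
asd_group = [1.5, 3.9, -2.0, 4.4, 0.8]
u_stat, p_value = mann_whitney_u(td_group, asd_group)
print(u_stat, p_value < 0.05)
```

With group sizes like those in this study (20 and 39), the normal approximation is generally considered reasonable; for very small samples an exact method would be more appropriate.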

Declarations
Data availability statement
No datasets were generated or analyzed during the current study.

Tables
Table 1 Intragroup

Figure 1
"Raw" representative data of pupil diameter fluctuations from one child each in the ASD group and TD group during task 1-a During the shift from the scrambled image to the original image, the rate of pupillary contraction was lower in the autism spectrum disorder (ASD) group than in the typically developing (TD) group. In the ASD group, pupil dilation continued even after the shift to the original image.

Figure 2
Heatmaps of on-screen fixations between groups in task 1-b Heatmaps revealed the density of fixations of children in each group. Regarding image C, both groups gazed more at faces than at letters. Concerning image D, ASD children fixed their eyes more on geometric patterns than on faces, whereas TD children showed a preference for faces over geometric patterns.

Figure 3
"Raw" representative data of pupil diameter fluctuations from one child each in the ASD group and TD group during task 2 During the shift from the nonsense image to the face-like figure, the rate of pupillary dilatation was lower in the ASD group than in the TD group.

Figure 4
Images presented in task 1-a Images A and B were scrambled versions of images C and D. Children were presented with these images in the following order: image A, C, B, and D.

Figure 5
Images presented in task 2 Image E was a geometric pattern with 6 red dots. Image F was a face-like figure processed from image E. Participants were shown image F after image E.