Impact of Mild to Moderate U-shape Hearing Loss on Listening Effort in School-aged Children with Alport Syndrome

In this study, we investigated the effect of mild to moderate U-shape hearing loss on listening effort in children with Alport syndrome (AS) using pupillometry. Subjects were required to answer questions after listening to conversations that simulated real scenes in daily life. We recorded the accuracy rate and pupil data under two conditions: SNR = +15 dB and SNR = -2 dB. A mixed-effect model was established to analyze the influence of SNR, mid-frequency energy proportion and hearing status on accuracy and pupil response. The results showed that SNR had a main effect on accuracy. The baseline pupil diameter of AS children was consistently smaller than that of normal-hearing children. When analyzing the time window that included listening to the conversation, listening to the question and thinking, SNR and hearing status had main effects on mean pupil dilation. We concluded that AS children with hearing loss were often in a state of low arousal before the auditory task. Both hearing status and task difficulty affect the listening effort of AS children. The effort of AS children with U-shape hearing loss might come mainly from subsequent cognitive processing (as a consequence of effortful listening) rather than from passive listening during speech communication.


Introduction
Alport syndrome (AS) is a rare hereditary disorder that is characterized by chronic kidney disease, sensorineural hearing loss and ocular abnormalities, resulting from mutations in the collagen IV genes [1].
The majority of male patients with X-linked AS experience mild to moderate sensorineural hearing loss, which is manifested in the middle frequency range [2]. Hearing loss mainly emerges at school age and gradually deteriorates with increasing age [2,3]. Therefore, school-aged children with AS typically show a bilateral, symmetrical, mild to moderate, U-shape hearing loss (Fig-1). Their audiogram configuration falls largely within the range of the 'Speech Banana', which is widely used in clinics for hearing aid fitting and rehabilitation to ensure reception of speech information. Accordingly, their speech recognition scores are usually good [3].
For school-aged children, successful communication and learning in the educational environment require a combination of perceiving, comprehending and inferring the message conveyed by the speaker, which involves both speech reception and cognitive processing of the received information. These steps require constant allocation of attentional and cognitive resources [4]. However, classrooms are frequently noisy environments and not always optimal for listening. It is generally accepted that people with hearing impairment have difficulties in speech processing under acoustically challenging conditions, which may increase reliance on top-down cognitive processes for compensation and result in the experience of "effortful listening" [5,6]. Thus, it is reasonable to speculate that AS children with hearing loss are at high risk of sustained effortful listening in the classroom compared to their normally hearing peers.
This could further lead to significant negative impacts in school-aged children [7]. Children develop key cognitive, linguistic, and academic skills in the classroom; however, the increased mental demands of effortful listening may exert deleterious effects on educational outcomes [8,9]. Furthermore, previous literature has shown that key aspects of auditory cognitive function are impaired in school-aged children even with mild hearing loss [10]. Therefore, how and when to intervene is a significant concern for hearing-impaired children with AS. A better understanding of auditory cognitive ability in this population would be valuable for reducing listening effort and optimizing learning performance.
Currently, assessment and intervention options for these AS children depend largely on auditory tests.
However, conventional audiological batteries that combine pure-tone audiometry with speech intelligibility tests seem to be insensitive to AS children with mild to moderate U-shape hearing loss due to ceiling effects. Moreover, traditional speech recognition tasks mainly provide information about the correct reception of what was heard, but fail to provide a clear evaluation of the "effort aspect" of a patient's hearing experience [11]. Consequently, a reliable clinical measure of listening effort could be a more comprehensive means of indexing this important dimension [12,13].
Owing to a surge of interest in the auditory cognitive abilities of children with hearing loss (CHL), an increasing number of studies have sought to measure listening effort in school-aged children using behavioral tests, physiological measures or self-report scales [7,14,15]. To date, however, no study has used any of these methods, including pupillometry, to evaluate listening effort in school-aged children with a typical U-shape hearing loss. For this group of children, both audiograms and speech recognition scores suggest successful speech communication, but it remains unclear whether they still experience high listening effort without hearing aid intervention.
Pupillometry, an approach for analyzing changes in the size of the eye's pupil, can provide a reliable and sensitive index of listening effort [16]. The tight relationship between pupil diameter and mental effort has been well documented for over a century [17][18][19]. Pupillometry enables continuous recording of pupil diameter, which can monitor moment-to-moment changes in cognitive load during a listening task [20]. Changes in pupil diameter are more sensitive and reliable than skin conductance, reaction time or subjective rating scales [21]. Peak pupil dilation (PPD) has repeatedly been verified to be sensitive to speech intelligibility level and masking type [22]. Wendt et al. found that PPD was highest when speech recognition was performed in the presence of a 1-talker masker, compared to 4-talker babble and a fluctuating noise masker [23]. Additionally, PPD changed in a nonlinear way with decreasing SNRs, reaching a maximum value when speech recognition performance was around 40-50% [24]. Mean pupil dilation (MPD) can be used to evaluate how cognitive load is sustained across a given auditory trial; it is more robust than PPD in designs with longer stimuli and reflects whole-trial auditory performance more comprehensively [25].
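To make the two metrics concrete, the sketch below (our own illustration, not code from the studies cited) computes PPD and MPD from a single trial's pupil trace, using relative baseline correction as described in the Methods:

```python
import numpy as np

def pupil_metrics(trace, baseline):
    """Illustrative computation of PPD and MPD for one trial.

    trace    : pupil diameter samples in the analysis window (e.g., mm)
    baseline : mean pupil diameter during the pre-stimulus baseline
    Returns the peak (PPD) and mean (MPD) of the baseline-corrected
    relative dilation, i.e. (trace / baseline) - 1.
    """
    relative = np.asarray(trace, dtype=float) / baseline - 1.0
    return relative.max(), relative.mean()
```

For example, a trace of [4.0, 4.4, 4.2] mm against a 4.0 mm baseline yields PPD = 0.10 and MPD = 0.05, i.e. a 10% peak and 5% mean dilation over the window.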
Because pupil measurement requires a stabilized head position, sustained attention and patience, it is relatively hard to test children with pupillometry, and only a few studies have used it to evaluate listening effort in school-aged children. Johnson and colleagues measured pupil dilation in children aged 7.5 to 14 years to investigate the extent to which age differences in short-term memory (STM) capacity were related to task engagement during encoding [26]. Steel et al. examined whether binaural fusion reduces listening effort in 11- to 15-year-old children who use bilateral cochlear implants [27]. McGarrigle et al. measured listening effort and fatigue in normal-hearing children (8-11 years old) [28]. These studies showed that it is feasible to use pupillometry to quantify elevated listening effort in hearing-impaired school-aged children.
Therefore, we conducted this study to evaluate the impact of mild to moderate U-shape hearing loss on listening effort in a group of school-aged children with AS, who warrant more research attention. To approximate classroom-like communication patterns, a listening-answer task was used in the current research. Two listening conditions were created: SNR = +15 dB and SNR = -2 dB, representing a quiet environment and a classroom environment respectively [29]. We hypothesised similar answer performance in both conditions because the speech recognition performance of AS children was previously shown to be good [3]. Notably, the hearing loss of AS children is characterized by an apparent drop in the mid-frequency range, and energy in the 1k-3k Hz range of a target sentence has been found to correlate positively with intelligibility [30]. Therefore, we also hypothesised that the adverse effect of hearing loss on the listening-answer task and on listening effort could depend not only on the level of background noise, but also on the spectral distribution of the target sentences. The present study revealed that both hearing status and cognitive ability affect the listening effort of AS children. The effort of AS children with U-shape hearing loss might come mainly from the subsequent cognitive processing of degraded speech inputs rather than from simple speech reception in daily life.

Fig-2 shows the correlation results for each item. The regression analysis showed that the MER13 of the stimulus material and the MER13 of the dialogue were not related to the total number of words or the duration of the stimulus (P > 0.05), indicating that although the stimulus materials are not consistent in duration and total word count, these factors have no effect on the spectral distribution of the sentences.

Baseline pupil diameter (BPD)
There was a significant difference in baseline pupil level between NH children and HI children. The NH children always showed a larger baseline pupil level before the listening task, regardless of whether SNR = +15 dB or SNR = -2 dB (Chisq = 4.67, df = 1, p = 0.04) (Fig-4).

Mean pupil dilation (MPD)
A mixed-model ANOVA revealed a significant main effect of SNR (Chisq = 2.05, df = 1, p < 0.05) and hearing status (Chisq = 0.29, df = 1, p < 0.05) on the MPD (Fig-7). The analysis also revealed a significant interaction effect between hearing status (HI versus NH) and SNR (Chisq = 5.74, df = 1, p = 0.02), indicating that the response across SNRs varied between listener groups. The MPD of HI children was significantly increased at -2 dB SNR compared with that of NH children. However, at SNR = +15 dB, there was no significant difference in MPD between the two groups. There was no significant effect of MER13 on MPD (Chisq = 0.44, df = 1, p = 0.51), and no significant interaction effect was observed between MER13 and SNR or hearing status. PPD was not significantly correlated with hearing status, but was still negatively correlated with accuracy (Table 3).

Discussion
Hearing loss or background noise increases the cognitive resources required for decoding auditory inputs, leading to effortful listening [31]. AS children with hearing loss may often experience effortful listening, which can seriously affect the quality of learning and work. In the present study, we acquired the pupil dilation response (BPD, PPD and MPD) during a listening-answer task under two conditions (SNR = -2 dB and SNR = +15 dB) for NH and AS children. We showed the influence of SNR and of the spectral distribution of the test materials on listening effort (as indexed by pupil response) in school-aged children with Alport syndrome. Additionally, the present results corroborate a close link between the cognitive demands imposed by a listen-answer task and task-evoked pupil dilation, consistent with research in adults. This provides further evidence of the negative effects of long-term hearing loss on auditory cognitive ability.
Our findings extend prior work in two ways. First, we provide evidence that, under the current experimental conditions, AS children with mild to moderate U-shape hearing loss need more cognitive resources than NH children mainly in the thinking stage, not just during listening. This is the first study to evaluate listening effort in school-aged children with mild to moderate U-shape hearing loss by pupillometry. This is a special group because, according to the traditional audiometric battery, their audiograms do not suggest impaired reception of speech-related information, and their speech communication is successful. Second, our longer speech materials and listening-answer task format better reflect the ecologically realistic scenarios that school-aged children regularly experience in everyday life. We have also shown that the task reliably elicits a task-evoked pupillary response in HI and NH children. Therefore, this task can reveal the process of cognitive resource allocation in realistic situations.
Our hypothesis was to explore the influence of SNR and spectral distribution on listening effort; however, we found a significant effect of SNR, but not of MER13, on accuracy and pupil response. There was a trend for AS children to perform worse when listening to auditory materials with more energy in the 1k-3k Hz range. Note that the speech materials were not spectrally processed, so the difference in spectral concentration between the high and low groups is natural and small. Although this allowed us to use pre-validated speech materials (especially important given that this task format is new), it might explain the insignificant trend observed for MER13. We assume that AS children might show significantly larger pupil sizes and worse performance if the spectral features of the speech materials were manipulated in a more causal experimental design.
In our study, we used two kinds of time window to analyze pupil responses. In the time window containing only listening to the dialogue, there was no significant difference in pupil size between AS children and NH children at either SNR = -2 dB or SNR = +15 dB. The results of the mixed-effect model also suggested that SNR, hearing status and MER13 had no significant effects on PPD or MPD, indicating that AS children might not need to expend much extra cognitive effort at this stage. This is consistent with the results from clinical audiograms and speech recognition tasks, which show that AS children are generally successful in speech communication.
However, in the time window that included listening to the dialogue, listening to the question and thinking, larger pupil sizes were observed in AS children at -2 dB SNR compared to their NH peers, suggesting that extra effort might arise especially at the thinking stage for AS children when the task is difficult. In this time window, the subjects needed to re-extract the dialogue content they had heard, connect it with the question, and make a decision after reasoning or calculating. The results show significant main effects of SNR and hearing status on MPD, which is more robust than PPD in designs with longer stimuli [32]. MPD provides further information about sustained cognitive load within a given time window [25]. The larger MPD in this time window represents constant allocation of cognitive resources in AS children. Increasing the SNR reduced the MPD, suggesting that intervention might relieve effortful listening.
The framework of the Ease of Language Understanding (ELU) model may offer an explanation for our results [33]. In the time window including listening to the dialogue, the question and thinking, the listeners needed to store the sentence and reconstruct information that was either missed or distorted. When there is a mismatch between the input signal and long-term memory, explicit processing needs to be initiated, which depends on cognitive abilities such as working memory capacity and efficiency [34]. This explicit processing requires extra cognitive resources from the limited pool that would otherwise be used for other mental tasks (e.g., calculation, deduction). Therefore, we see lower answer accuracy in AS children under the difficult SNR condition because they could not invest as many cognitive resources in the mental processing needed to reach the correct answer. AS children also need to allocate more cognitive resources in general, which is reflected by their larger pupil size.
Note that without this new behavioral task, which involves both listening and the associated mental processing, we might not have observed these differences between AS and NH children in either behavioral or physiological responses. Ecologically, it is rare to find a speech communication scenario where only passive listening and repetition are required, as in some previous listening effort studies. Typically, listeners need to decode the speech inputs, process the information and prepare a response accordingly, either to extract more information or to give more information. All of this draws on a limited pool of cognitive resources. Extra resource demands at the stage of speech comprehension would impair the accuracy and efficiency of later stages, leading to poorer social communication, learning and development. This highlights the importance of investigating ecological speech communication scenarios in order to identify groups at higher risk of impaired cognitive development.
It is widely accepted that the pupil response has a close relationship with cognitive function [17], but the exact relationship between the ability to allocate cognitive resources and cognitive function remains unclear. According to the hypotheses proposed by Van der [35], our results are more consistent with the second hypothesis, namely that people with better cognitive function utilize cognitive resources more efficiently, so they need to invest relatively few cognitive resources when executing a task. This suggests that long-term hearing loss has a negative effect on the efficiency of utilizing cognitive resources. Therefore, AS children need to devote more cognitive resources to cope with the current task, and show larger mean pupil dilation due to their lower efficiency of utilizing cognitive resources.
As indicated, some researchers consider that BPD can reflect the arousal state of the cognitive system [36]. Gilzenrat proposed that a larger BPD before the experiment represents an over-aroused cognitive system, indicating that the subjects are anxious or stressed [37]. Moreover, Alhanbali found a significant correlation between BPD and fatigue scales [38]; a smaller BPD represents a low-arousal state of the cognitive system, indicating greater fatigue. According to the Yerkes-Dodson law, both over-arousal and low arousal are associated with poor performance on cognitive tasks, while arousal levels between these two extremes are expected to yield the best performance [39][40]. Animal studies have also revealed an inverted U-shaped relationship between task performance and arousal level. McGinley found that optimal signal detection behavior and sound-evoked responses in mice occurred only at intermediate arousal, when membrane potentials were stably hyperpolarized, rather than at low arousal or over-arousal [41]. In our study, the BPD of AS children was smaller in all listening conditions compared to NH children, suggesting that they suffered from auditory fatigue during auditory cognitive processing. One negative impact of long-term hearing loss is that their cognitive system is often in a state of low arousal. These findings indicate that the long-term hearing loss of AS children affects cognition in two respects, the efficiency of utilizing cognitive resources and the arousal state, and both further affect listening effort (Fig-9).
There are several limitations of the present study that should be mentioned. First, we measured objective listening effort but did not administer specialized tests of cognitive abilities such as working memory capacity and efficiency. We might have gained more insight into the associations between listening effort and cognitive function if we had also included variables controlling for individual differences in cognitive abilities. Second, we did not experimentally manipulate the spectral distribution of the speech materials, choosing instead to use pre-validated and more natural stimuli. The results therefore only suggested a trend for MER13 to influence behavioral performance, without reaching significance. Stricter manipulation of the spectral distribution could be adopted in future studies.

Participants
A total of 33 boys aged 8-16 years were included in this study: 22 were AS children (mean age = 12.21 ± 0.66 years) with a U-shape hearing loss (Fig-10, PTA0.5k-4k = 45.12 ± 8.774 dB HL, PTA1k-2k = 50.18 ± 10.25 dB HL) and 11 were age-matched normal-hearing (NH) children (mean age = 11.64 ± 0.52 years) with pure-tone audiometric thresholds confirmed to be lower than 20 dB HL at 500, 1000, 2000 and 4000 Hz. All children were native Mandarin speakers and had normal vision. Written parental consent and verbal assent from the children were obtained prior to testing. The Peking University First Hospital Research Ethics Committee approved the study (reference number: 2020-068).

Equipment
The participant was seated 60 cm away from a 19-inch flat-screen computer monitor at 0° azimuth, which displayed the visual reminder of stimulus onset and the response prompt. Stimulus presentation was programmed in Matlab. Auditory stimuli were presented through around-ear headsets (HDA200). Pupil size was recorded using an eye camera with a sampling rate of 100 Hz (https://pupil-labs.com/products).

Speech stimuli
A total of 50 listening-focused and age-appropriate speech conversations were created for experimental trials. These short conversations were taken from the Youth Chinese Test (YCT), which was developed and validated by Confucius Institute Headquarters to assess primary and middle school students' abilities to use Chinese in their daily lives (http://www.hanban.org).
The comprehension difficulty of these short conversations was equivalent. Each conversation contained three sentences totalling 30-50 words and lasted between 9 s and 13 s. The first two sentences of the conversation introduced the scene and were recorded from a native male and a native female speaker of standard Mandarin, respectively. The third sentence posed a question about the conversation (e.g., Female: Hurry up, it is half past eight, the train is about to leave. Male: Don't worry, there is still half an hour left. Question: What time does the train leave?). Therefore, unlike speech materials used in past listening effort studies that require only listening to and repeating sentences, this listening-answer task requires participants both to understand the conversation and to perform further mental processing of the received information to provide the correct answer.
All sentences were recorded in an anechoic chamber at a natural pace from native male and female Mandarin speakers. To evaluate the impact of mid-frequency energy on the listening effort of AS children, Praat was used to divide the conversations into two categories according to the proportion of mid-frequency energy: Group A with more energy in the mid-frequency range (1000 Hz to 3000 Hz), and Group B with less (Fig. 11). Performance in a pilot study further suggested that these short conversations were suitable for children aged 8-16 years.
To investigate the impact of SNR on listening effort, a background noise file consisting of multi-talker babble was digitally mixed with the speech stimuli to create two listening conditions: easy and difficult. The SNR was +15 dB for the easy condition and -2 dB for the difficult condition. Multi-talker babble was used as the masker because informational masking is believed to increase cognitive load [42]. The overall output level for both listening conditions was fixed at 65 dB SPL.
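The mixing step can be sketched as follows (a minimal illustration of SNR mixing in general; the study does not report its actual mixing code): the babble is rescaled so that the ratio of speech power to babble power matches the target SNR before the two signals are summed.

```python
import numpy as np

def mix_at_snr(speech, babble, snr_db):
    """Scale a babble masker and add it to speech at a target SNR (dB).

    Both inputs are 1-D sample arrays of equal length; the babble is
    rescaled so that 10*log10(P_speech / P_babble) equals snr_db.
    """
    speech = np.asarray(speech, dtype=float)
    babble = np.asarray(babble, dtype=float)
    p_speech = np.mean(speech ** 2)           # mean power of the speech
    p_babble = np.mean(babble ** 2)           # mean power of the masker
    target_p_babble = p_speech / (10 ** (snr_db / 10))
    scaled = babble * np.sqrt(target_p_babble / p_babble)
    return speech + scaled
```

In practice the mixed signal would then be rescaled once more so that the overall presentation level is fixed (65 dB SPL in this study), which leaves the SNR unchanged.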

Design and procedure
On arrival, participants were seated comfortably in the soundproof booth. The luminance (206 lux) of the visual field was controlled by adjusting room lighting and screen brightness. Pupil calibration was then performed to ensure that raw eye data could be accurately mapped. Participants were then given the following instructions: "During the listening task, you will hear some short conversations. Please fixate on the cross in the middle of the screen and try not to blink or move suddenly. After each conversation, you will also hear a question. Shortly afterwards, an "o" sign will appear on the screen. Please try your best to answer as quickly and accurately as you can at the sight of the "o" sign. Try to look straight ahead at the cross in the middle of the screen while you listen to each conversation. It is important that you pay attention to the whole conversation." Participants were given the opportunity to perform a practice session before the recorded experiment began, which ensured that the children were adequately familiar with the task and able to perform it accurately.
Overall, each participant was presented with 2 easy blocks and 2 difficult blocks, each block consisting of 10 conversations (40 conversations in total). The order of easy and difficult blocks was randomized. Participants began each trial by fixating on a "+" sign shown in the center of the screen. Background noise (multi-talker babble) was then presented. After 2 s of noise-alone presentation, the conversation began. Each conversation was also followed by 1 s of noise-alone presentation. The end of the conversation was followed by a question recorded from a native female speaker. Then, after a 2 s period for thinking, an "o" sign appeared on the screen and the children answered the question (Fig-12).

Analysis

Speech materials feature summary
To check whether the word count and duration of the stimulus materials influenced the spectral distribution, we conducted a correlation analysis on several features of the stimulus materials: (1) mean word count of the stimulus materials, (2) mean duration of the short dialogue (MD of dialogue), (3) mean duration of the question (MD of question), (4) mean energy ratio of the short dialogue at 1k Hz to 3k Hz (MER13 of dialogue), and (5) mean energy ratio of the question at 1k Hz to 3k Hz (MER13 of question). This was done to identify potential confounds in the speech material features that could affect listening-answer task performance and listening effort.
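An MER13-style measure, the proportion of spectral energy in the 1k-3k Hz band, can be computed from the magnitude spectrum. The function below is a minimal sketch under that definition, not the Praat procedure actually used in the study:

```python
import numpy as np

def mid_energy_ratio(signal, fs, lo=1000.0, hi=3000.0):
    """Proportion of total spectral energy between lo and hi Hz
    (an MER13-style measure when lo = 1 kHz and hi = 3 kHz).

    signal : 1-D array of audio samples
    fs     : sampling rate in Hz
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.abs(np.fft.rfft(signal)) ** 2          # power spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)     # bin frequencies
    band = (freqs >= lo) & (freqs <= hi)                 # 1k-3k Hz bins
    return spectrum[band].sum() / spectrum.sum()
```

For instance, a pure 2 kHz tone yields a ratio near 1, and a pure 500 Hz tone a ratio near 0.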

Pupil data
The analyses of pupil data were consistent with those described by Yue Zhang [43] and Matthew B [44]. The pupil diameter measured from dialogue onset to the end of the trial was divided by the baseline level to obtain relative changes in pupil diameter elicited by the task.
Pupil diameter data were pre-processed in Matlab to detect and remove blinks and gaze displacements, because these can affect the validity of pupil diameter measurements. Sample points were coded as blinks when pupil diameter values were more than 3 standard deviations (SD) below the mean of the unprocessed trace, or when gaze positions were more than 3 SD away from the center of fixation. Traces from 10 data points (0.1 s) before the start of a blink to 10 data points after its end were interpolated cubically in Matlab to further decrease the impact of the obscured pupil. Trials in which blinks and missing data exceeded 20% of the total samples were considered invalid and excluded from all analyses. No participants were excluded from the analysis. All valid traces were then smoothed using a first-order Butterworth filter and downsampled from 120 Hz to 30 Hz. Processed traces were then aligned either 1) by the onset of the response prompt (the display of the circle signaling participants to answer the question) or 2) by the offset of the dialogue. These two alignments provide two analysis windows containing: 1) listening to the dialogue and question with extra time for preparing the answer, and 2) listening to the dialogue. The aligned traces were then aggregated per listener for each condition.
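A simplified sketch of this pipeline is shown below. The study used Matlab with cubic interpolation and Butterworth filtering; here, as stand-ins, blink samples are filled by linear interpolation and the trace is decimated by simple slicing, so this illustrates the structure of the steps rather than reproducing the exact implementation:

```python
import numpy as np

def preprocess_pupil(trace, sd_criterion=3.0, down_factor=4, max_invalid=0.2):
    """Simplified blink-rejection pipeline for one trial's pupil trace.

    Samples more than sd_criterion SDs below the trace mean, or missing
    (NaN), are flagged as blinks. Trials with more than max_invalid
    (20%) flagged samples are rejected. Remaining gaps are filled by
    linear interpolation and the trace is decimated (e.g., 120 -> 30 Hz).
    """
    trace = np.asarray(trace, dtype=float)
    mean, sd = np.nanmean(trace), np.nanstd(trace)
    valid = np.isfinite(trace) & (trace > mean - sd_criterion * sd)
    if (~valid).mean() > max_invalid:
        return None  # trial excluded: too many blinks / missing samples
    idx = np.arange(len(trace))
    cleaned = np.interp(idx, idx[valid], trace[valid])  # fill blink gaps
    return cleaned[::down_factor]                       # decimate
```

A production pipeline would also pad the interpolated region around each blink and apply a low-pass (e.g., Butterworth) filter before downsampling, as described above.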
All statistical analyses and plotting were performed in R 4.0 (https://www.r-project.org). We applied linear mixed models (LMMs) to analyze the data, as LMMs tolerate missing values, whereas repeated-measures ANOVA uses only complete cases. Moreover, mixed-effects models are more flexible in handling the multilevel structure of the data. To examine the effects of SNR, hearing status and MER13 on PPD, a mixed-effect model was fitted on the PPD data, using SNR, hearing status and MER13 as fixed-effect factors, with listener and material list as random-effect factors. Mixed-effect models allow the variance associated with random factors to be controlled without data aggregation. Therefore, by using listener and the word list used for stimuli as random-effect factors in the model, we controlled for the variance in overall performance (random intercept) and the dependency on other fixed factors (random slope) associated with listener and material list. Models were constructed using the lme4 package in R, and figures were produced using the ggplot2 package. Fixed- and random-effect factors entered the model and remained only if they significantly improved the model fit, using chi-squared tests based on changes in deviance (p < 0.05). Differences between levels of each factor and interactions were examined with post-hoc Wald tests. p values were estimated using the z distribution as an approximation for the t distribution. In the same way, linear mixed models were used to analyze the effects on MPD and accuracy. Finally, linear models (lm) were used to analyze the correlations between age, PPD, MPD, PTA1k-2k, PTA0.5k-4k and the question-answering accuracy of all subjects.
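The deviance-based chi-squared test used for model comparison can be illustrated as follows. This is a sketch of the test statistic only, not of the lme4 fitting itself; for df = 1 the chi-squared survival function reduces to erfc(sqrt(x/2)):

```python
import math

def lrt_pvalue_df1(deviance_reduced, deviance_full):
    """Likelihood-ratio (chi-squared, df = 1) p-value for whether adding
    one fixed effect significantly improves a nested model, given the
    deviances (-2 log-likelihood) of the reduced and full models.
    """
    chisq = deviance_reduced - deviance_full      # change in deviance
    return math.erfc(math.sqrt(chisq / 2.0))      # chi2 sf for df = 1
```

For example, a deviance change of 5.74 (as reported for the hearing status by SNR interaction on MPD) corresponds to p ≈ 0.017, consistent with the reported p = 0.02, and the conventional threshold chisq = 3.84 gives p ≈ 0.05.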

Declarations
Statement

Figure 1
A typical audiogram of school-aged children with AS. This audiogram shows a bilateral sensorineural hearing loss, overlapping with the Speech Banana.

Figure 4
Change in baseline pupil diameter (BPD) before the listening task for HI and NH children in two conditions. Error bars represent 1 standard error of the mean.
Figure 5
Visualization of pupil size when aligned by the onset of response in two conditions. The red curve represents NH children and the black curve represents HI children, with the shaded width indicating 1 standard error from the mean.

Figure 7
Baseline-corrected mean pupil dilation across different SNRs and MER13 for NH and HI children. Dots represent the standard error of the mean.

Figure 8
Visualization of pupil size when aligned by the onset of response in two conditions. The red curve represents NH children and the black curve represents HI children.

Figure 9
The relationship between hearing loss, cognitive ability and listening effort in AS children.

Figure 10
Audiogram of AS children. Averaged pure-tone hearing thresholds of bilateral ears across 0.25k, 0.5k, 1k, 2k, 4k and 8k Hz for the AS children. Error bars show standard deviations.